COMPUTER SYSTEM WITH HEAP RESET

Field of the Invention

The invention relates to a computer system supporting an 
object-oriented environment having storage, at least a portion of which is 
divided into multiple heaps. 

Background of the Invention

Programs written in the Java programming language (Java is a
trademark of Sun Microsystems Inc) are generally run in a virtual machine 
environment, rather than directly on hardware. Thus a Java program is 
typically compiled into byte-code form, and then interpreted by a Java 
virtual machine (JVM) into hardware commands for the platform on which the 
JVM is executing. The JVM itself is an application running on the 
underlying operating system. An important advantage of this approach is 
that Java applications can run on a very wide range of platforms, providing
of course that a JVM is available for each platform. 



Java is an object-oriented language. Thus a Java program is formed
from a set of class files having methods that represent sequences of
instructions (somewhat akin to subroutines). A hierarchy of classes can be
defined, with each class inheriting properties (including methods) from
those classes which are above it in the hierarchy. For any given class in
the hierarchy, its descendants (i.e. below it) are called subclasses, whilst
its ancestors (i.e. above it) are called superclasses. At run-time objects
are created as instantiations of these class files, and indeed the class
files themselves are effectively loaded as objects. One Java object can
call a method in another Java object. In recent years Java has become very
popular, and is described in many books, for example "Exploring Java" by
Niemeyer and Peck, O'Reilly & Associates, 1996, USA, and "The Java Virtual
Machine Specification" by Lindholm and Yellin, Addison-Wesley, 1997, USA.

The standard JVM architecture is generally designed to run only a
single application, although this can be multi-threaded. In a server
environment used for database transactions and such-like, each transaction
is typically performed as a separate application, rather than as different
threads within an application. This is to ensure that every transaction
starts with the JVM in a clean state. In other words, a new JVM is started
for each transaction (i.e. for each new Java application). Unfortunately
however this results in an initial delay in running the application (the
reasons for this will be described in more detail later). The overhead due
to this frequent starting and then stopping a JVM as successive
transactions are processed is significant, and seriously degrades the
scalability of Java server solutions.

Various attempts have been made to mitigate this problem. EP-962860-A
describes a process whereby one JVM can fork into a parent and a child
process, this being quicker than setting up a fresh JVM. The ability to
run multiple processes in a Java-like system, thereby reducing overhead
per application, is described in "Processes in KaffeOS: Isolation, Resource
Management, and Sharing in Java" by G Back, W Hsieh, and J Lepreau (see:
http://www.cs.utah.edu/flux/papers/kaffeos-osdi00/main.html).

Another approach is described in "Oracle JServer Scalability and
Performance" by Jeremy Litzt, July 1999 (see:
http://www.oracle.com/database/documents/jserver_scalability_and_performance_twp.pdf).
The JServer product available from Oracle Corporation, USA,
supports the concept of multiple sessions (a session effectively
representing a transaction or application), each session including a
JServer session. Resources such as read-only bytecode information are
shared between the various sessions, but each individual session appears to
its JServer client to be a dedicated conventional JVM.

US patent application 09/304160, filed 30 April 99 ("A Long Running
Reusable Extendible Virtual Machine"), assigned to IBM Corporation (IBM
docket YOR9-1929-0170), discloses a virtual machine (VM) having two types
of heap, a private heap and a shared heap. The former is intended primarily
for storing application classes, whilst the latter is intended primarily
for storing system classes and, as its name implies, is accessible to
multiple VMs. A related idea is described in "Building a Java virtual
machine for server applications: the JVM on OS/390" by Dillenberger et al,
IBM Systems Journal, Vol 39/1, January 2000. Again this implementation uses
a shared heap to share system and potentially application classes for reuse
by multiple workers, with each worker JVM also maintaining a private or
local heap to store data private to that particular JVM process.

The above documents are focused primarily on the ability to easily
run multiple JVMs in parallel. A different (and potentially complementary)
approach is based on a serial rather than parallel configuration. Thus it
is desirable to run repeated transactions (i.e. applications) on the same
JVM, since this could avoid having to reload all the system classes at the
start of each application. However, one difficulty with this is that each
application expects to run on a fresh, clean, JVM. There is a danger with
serial re-use of a JVM that the state left from a previous transaction
somehow influences the outcome of a new transaction. This unpredictability
is unacceptable in most circumstances.

US patent application 09/584641 filed 31 May 2000 in the name of IBM
Corporation (IBM docket number GB9-2000-0061) discloses an approach for
providing a JVM with a reset capability. US provisional application
60/208268 also filed 31 May 2000 in the name of IBM Corporation (IBM docket
number YOR9-2000-0359) discloses the idea of having two heaps in a JVM. One
of these is a transient heap, which is used to store transaction objects
that will not persist into the next transaction, whilst a second heap is
used for storing objects, such as system objects, that will persist. This
approach provides the basis for an efficient reset mechanism by simply
deleting the transient heap. The techniques described herein represent
optimisations of the above methods, to allow the JVM reset to be performed
as quickly and consistently as possible.

Summary of the Invention

Accordingly, the invention provides a computer system providing an
object-based virtual machine environment for running successive
applications, said computer system including storage, at least a portion of
which is logically divided into two or more heaps in which objects can be
stored, wherein a first heap is reset between successive applications, and
a second heap persists from one application to the next, said system
including:

a card table comprising multiple cards, each corresponding to a 
region of said storage, each card in the card table being set to null when
the first heap is reset between successive applications; 

means for marking a card whenever an object in its corresponding 
storage region is created or updated; and 

means for detecting possible references from the second heap to the 
first heap at reset by scanning the cards in the card table corresponding
to the second heap, and detecting any cards which have been marked. 

The card table provides a rapid mechanism for identifying potential
references from the second heap to the first heap, which would prevent
proper reset of the first heap. This is much quicker than scanning the
entire second heap itself. (Note that immediately after reset it is known
that nothing on the second heap can reference anything on the first heap;
this is a precondition of reset in the first place). If any marked cards
are present, which therefore represent potential references, then any
objects in the corresponding region of storage are located, and examined
for any references to the first heap in the located objects. In other
words, those objects which actually do reference the first heap are
identified, as opposed to those which have had some other update (e.g. with a
pointer into the second heap, or to some non-pointer data field). Note that
this step might not be necessary if the marking were more discriminatory,
in other words, if cards were only marked when specifically a reference to
the first heap was inserted. However, this checking would seriously impact
overall system performance, hence it is effectively deferred until the end
of the application.

In the preferred embodiment, the identification of references to the
first heap now prompts the system to perform the mark phase of a garbage
collection to determine live objects in at least the second heap. This
allows the detection of any objects in the second heap that are marked as
live and which have references to the first heap. Responsive to the
detection of any such objects, an error condition is returned to prevent
reset for another application. This reflects the fact that it is not
possible to reset the first heap if there are still live references into it
from the second heap; on the other hand, references into it from objects in
the second heap which are no longer live are not problematic, since these
objects will generally be garbage collected in due course.
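
The liveness check described above can be sketched roughly as follows. This is only an illustrative model, not the embodiment's code: the `Obj` and `ResetCheck` classes, the `onTransientHeap` flag (standing for the first heap) and the use of an explicit root list are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical heap object: a name, the heap it lives on, and outgoing references. */
final class Obj {
    final String name;
    final boolean onTransientHeap;          // true = first heap, false = second heap
    final List<Obj> refs = new ArrayList<>();

    Obj(String name, boolean onTransientHeap) {
        this.name = name;
        this.onTransientHeap = onTransientHeap;
    }
}

final class ResetCheck {
    /** Mark phase: collect every object reachable from the given roots. */
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> live = new HashSet<>();
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (live.add(o)) {              // first visit: follow its references
                work.addAll(o.refs);
            }
        }
        return live;
    }

    /**
     * Reset is refused only if a live second-heap object still refers to the
     * first heap; the same reference from a dead object is harmless.
     */
    static boolean canReset(List<Obj> roots, List<Obj> suspects) {
        Set<Obj> live = mark(roots);
        for (Obj o : suspects) {
            if (o.onTransientHeap || !live.contains(o)) continue;
            for (Obj r : o.refs) {
                if (r.onTransientHeap) return false;
            }
        }
        return true;
    }
}
```

Here `suspects` stands for the objects located via marked cards; passing only those, rather than the whole second heap, is what keeps the check cheap.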

Note that it is also necessary to perform a full mark phase if the
second heap has been compacted since the previous reset, because this will
have invalidated the card table. In fact, it would be possible in theory to
move cards at the same time as the compaction is performed, but this is
rather complex where there is not a one-to-one correspondence between cards
and objects, and so is avoided in the preferred embodiment.

In the preferred embodiment, an object is only considered as within
the region of storage corresponding to a card if a predetermined part of
the object (such as its header) is in that region, thereby ensuring that
each object is uniquely allocated to a particular card. It will be
appreciated that there is considerable flexibility in the structure of
cards used. For example, one possibility would be to have a single card per
object, but this leads to variable sized cards and slows down the marking
process. Thus preferably the cards each correspond to a uniformly sized
region of memory, typically in the range 256 to 2048 bytes. This provides
a good compromise between storage considerations (not making the card table
too large), and at the same time reasonable discrimination of the actual
objects which have been updated.
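
A minimal sketch of such a card table is given below, assuming one byte per card, a power-of-two card size, and addresses modelled as plain `long` values. The class name `CardTable` and its methods are hypothetical and chosen for illustration; they are not taken from the embodiment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical card table: one byte per uniformly sized region of a heap. */
final class CardTable {
    private final long heapBase;   // start address of the heap being covered
    private final int cardShift;   // log2 of the card size in bytes
    private final byte[] cards;    // 0 = null (clean), 1 = marked

    CardTable(long heapBase, long heapSize, int cardSize) {
        if (Integer.bitCount(cardSize) != 1) {
            throw new IllegalArgumentException("card size must be a power of two");
        }
        this.heapBase = heapBase;
        this.cardShift = Integer.numberOfTrailingZeros(cardSize);
        this.cards = new byte[(int) ((heapSize + cardSize - 1) >>> cardShift)];
    }

    /** Mark the card that contains the header of a created or updated object. */
    void markObject(long headerAddress) {
        cards[(int) ((headerAddress - heapBase) >>> cardShift)] = 1;
    }

    /** At reset: indices of marked cards, i.e. regions holding potential references. */
    List<Integer> markedCards() {
        List<Integer> marked = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] != 0) marked.add(i);
        }
        return marked;
    }

    /** Set every card back to null once the first heap has been reset. */
    void clear() {
        Arrays.fill(cards, (byte) 0);
    }
}
```

With 512-byte cards, marking becomes a shift and a single byte store, and a 4096-byte heap needs only eight cards; keying on the header address is what assigns each object to exactly one card.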

Preferably the system further comprises means for detecting
references or possible references to the first heap from a set of
predetermined locations; and means responsive to the detection of any such
references or possible references for returning an error condition to
prevent reset for another application. Examples of the predetermined
locations are the stacks and registers; potential references from here to
the first heap indicate that the objects therein may still be live, and so
the first heap cannot be reset.

In the preferred embodiment, the system also detects any objects on
the first heap which are reachable from virtual machine system class
objects. Since the system class objects will be retained for the next
application, any such detected objects are promoted to the second heap to
avoid the reset. Note however that if any of these objects to be promoted
actually belong to the application that is just terminating then an error
will ensue, since the application objects must be deleted at reset in order
to make way for the next application.

The invention further provides a computer system providing an
object-based virtual machine environment for running successive
applications, said computer system including storage, at least a portion of
which is logically divided into two or more heaps in which objects can be
stored, wherein a first heap is reset between successive applications, and
a second heap persists from one application to the next, said system
including:

means for identifying any objects on the first heap which have a
finalization method; and

means for running the finalization methods of any identified objects
on the main thread prior to reset of the first heap.

By running the finalization methods on the main thread, their
processing becomes effectively synchronous with the reset, so that it can
be ensured that they have completed before reset. A further advantage is
that the finalization methods now run in a controllable context, as opposed
to the generic context of a finalizer thread.
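
The following sketch illustrates the idea of running finalization synchronously on the thread that performs the reset. It is a toy model only: `HasFinalizer` is a hypothetical stand-in for an object with a finalization method, and `TransientHeap` models the first heap as a simple list rather than real storage.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an object that has a finalization method. */
interface HasFinalizer {
    void runFinalizer();
}

/** Toy model of the first heap as a list of allocated objects. */
final class TransientHeap {
    private final List<Object> objects = new ArrayList<>();

    void allocate(Object o) {
        objects.add(o);
    }

    int size() {
        return objects.size();
    }

    /**
     * Runs every finalizer on the calling thread, then discards all objects.
     * Returns the number of finalizers run, so that a caller could decide
     * whether the pre-reset checks need to be repeated.
     */
    int reset() {
        int ran = 0;
        // Iterate over a copy, since a finalizer might allocate further objects.
        for (Object o : new ArrayList<>(objects)) {
            if (o instanceof HasFinalizer) {
                ((HasFinalizer) o).runFinalizer();
                ran++;
            }
        }
        objects.clear();
        return ran;
    }
}
```

Because `reset()` does not return until each finalizer has completed, no separate synchronisation with a finalizer thread is needed in this model.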




In the preferred embodiment, responsive to running any finalization
methods, it is verified that they have not performed any operations which
would prevent reset of the first heap. Thus for example, the finalization
methods themselves may create references from the second heap to the first
heap. Therefore much of the precautionary work to determine whether it
is in fact possible to reset the first heap typically needs to be
repeated.

The invention further provides a method of operating a computer
system providing an object-based virtual machine environment for running
successive applications, said computer system including storage, at least a
portion of which is logically divided into two or more heaps in which
objects can be stored, wherein a first heap is reset between successive
applications, and a second heap persists from one application to the next,
said method including the steps of:

providing a card table comprising multiple cards, each corresponding
to a region of said storage, each card in the card table being set to null
when the first heap is reset between successive applications;

marking a card whenever an object in its corresponding storage region
is created or updated; and

detecting possible references from the second heap to the first heap
at reset by scanning the cards in the card table corresponding to the
second heap, and detecting any cards which have been marked.

The invention further provides a method of operating a computer
system providing an object-based virtual machine environment for running
successive applications, said computer system including storage, at least a
portion of which is logically divided into two or more heaps in which
objects can be stored, wherein a first heap is reset between successive
applications, and a second heap persists from one application to the next,
said method including the steps of:

identifying any objects on the first heap which have a finalization
method; and

running the finalization methods of any identified objects on the
main thread prior to reset of the first heap.

The invention further provides a computer program product comprising
instructions encoded on a computer readable medium for causing a computer
to perform the methods described above. A suitable computer readable medium
may be a DVD or computer disk, or the instructions may be encoded in a
signal transmitted over a network from a server.

It will be appreciated that the methods and computer program product
of the invention will benefit from the same preferred features as the
systems of the invention.

Brief Description of the Drawings

A preferred embodiment of the invention will now be described in
detail by way of example only with reference to the following drawings:

Figure 1 shows a schematic diagram of a computer system supporting a
Java Virtual Machine (JVM);
Figure 2 is a schematic diagram of the internal structure of the JVM;
Figure 3 is a flowchart depicting the steps required to load a class
and prepare it for use;
Figure 4 is a flowchart depicting at a high level the serial reuse of
a JVM;
Figure 5 is a schematic diagram showing the heap and its associated
components in more detail;
Figures 6A and 6B form a flowchart illustrating garbage collection;
Figure 7 is a flowchart illustrating heap expansion policy at a high
level;
Figure 8 is a diagram of a lookup table used to determine if a
reference is in a heap;
Figure 9 is a diagram of a modified lookup structure for the same
purpose as Figure 8, but for use in a system with much larger memory; and
Figures 10A and 10B form a flowchart illustrating the operations
taken to delete the transient heap during JVM reset.

Detailed Description 

Figure 1 illustrates a computer system 10 including a
(micro)processor 20 which is used to run software loaded into memory 60.
The software can be loaded into the memory by various means (not shown),
for example from a removable storage device such as a floppy disk, CD ROM,
or DVD, or over a network such as a local area network (LAN),
telephone/modem connection, or wireless link, typically via a hard disk
drive (also not shown). Computer system 10 runs an operating system (OS) 30,
on top of which is provided a Java virtual machine (JVM) 40. The JVM looks
like an application to the (native) OS 30, but in fact functions itself as
a virtual operating system, supporting Java application 50. A Java
application may include multiple threads, illustrated by threads T1 and T2
71, 72.

System 10 also supports middleware subsystem 45, for example a
transaction processing environment such as CICS, available from IBM
Corporation (CICS is a trademark of IBM Corporation). The middleware
subsystem runs as an application or environment on operating system 30, and
initiates the JVM 40. The middleware also includes Java programming which
acts to cause transactions as Java applications 50 to run on top of the JVM
40. In accordance with the present invention, and as will be described in
more detail below, the middleware can cause successive transactions to run
on the same JVM. In a typical server environment, multiple JVMs may be
running on computer system 10, in one or more middleware environments.

It will be appreciated that computer system 10 can be a standard
personal computer or workstation, network computer, minicomputer,
mainframe, or any other suitable computing device, and will typically
include many other components (not shown) such as display screen, keyboard,
sound card, network adapter card, etc which are not directly relevant to an
understanding of the present invention. Note that computer system 10 may
also be an embedded system, such as a set top box, handheld device, or any
other hardware device including a processor 20 and control software 30, 40.

Figure 2 shows the structure of JVM 40 in more detail (omitting some
components which are not directly pertinent to an understanding of the
present invention). The fundamental unit of a Java program is the class,
and thus in order to run any application the JVM must first load the
classes forming and required by that application. For this purpose the JVM
includes a hierarchy of class loaders 110, which conventionally includes
three particular class loaders, named Application 120, Extension 125, and
Primordial 130. An application can add additional class loaders to the JVM
(a class loader is itself effectively a Java program). In the preferred
embodiment of the present invention, a fourth class loader is also
supported, Middleware 124.

For each class included within or referenced by a program, the JVM
effectively walks up the class loader hierarchy, going first to the
Application class loader, then the Middleware loader, then the Extension
class loader, and finally to the Primordial class loader, to see if any
class loader has previously loaded the class. If the response from all of
the class loaders is negative, then the JVM walks back down the hierarchy,
with the Primordial class loader first attempting to locate the class by
searching in the locations specified in its class path definition. If this
is unsuccessful, the Extension class loader then makes a similar attempt;
if this fails the Middleware class loader tries. Finally, if this fails the
Application class loader tries to load the class from one of the locations
specified in its class path (if this fails, or if there is some other
problem such as a security violation, the system returns an error). It will
be appreciated that a different class path can be defined for each class
loader.
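
The two-pass walk described above can be modelled with a small toy class, shown below as a sketch only. `ToyLoader` is a hypothetical type invented for this illustration and is deliberately not `java.lang.ClassLoader`; class names are plain strings and "loading" merely records which loader located the class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model of one loader in the hierarchy (not java.lang.ClassLoader). */
final class ToyLoader {
    final String name;
    final ToyLoader parent;                       // null for the Primordial loader
    private final Set<String> classPath;          // classes this loader can locate
    private final Set<String> loaded = new HashSet<>();

    ToyLoader(String name, ToyLoader parent, String... classPath) {
        this.name = name;
        this.parent = parent;
        this.classPath = new HashSet<>(Arrays.asList(classPath));
    }

    /** Walk up: has this loader or any loader above it already loaded the class? */
    private String findLoaded(String cls) {
        for (ToyLoader l = this; l != null; l = l.parent) {
            if (l.loaded.contains(cls)) return l.name;
        }
        return null;
    }

    /** Walk back down: the topmost loader gets the first attempt to locate the class. */
    private String locate(String cls) {
        String by = (parent == null) ? null : parent.locate(cls);
        if (by != null) return by;
        if (classPath.contains(cls)) {
            loaded.add(cls);
            return name;
        }
        return null;
    }

    /** Returns the name of the loader responsible for the class, or throws. */
    String load(String cls) {
        String by = findLoaded(cls);
        if (by == null) by = locate(cls);
        if (by == null) throw new IllegalStateException("class not found: " + cls);
        return by;
    }
}
```

In this model a class that appears on both the Primordial and the Application class path is always attributed to the Primordial loader, which is the practical consequence of the downward pass starting at the top of the hierarchy.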

Note that if it is desired to load a further middleware class loader 
(i.e. one provided by the user rather than included within the JVM itself), 
then this can be achieved by declaring that the new class loader implements 
the middleware interface. This declaration by itself is sufficient for the 
JVM to treat it as a middleware class loader; no other method definitions
or such-like are required.

The JVM further includes a component CL 204, which also represents a 
class loader unit, but at a lower level. In other words, this is the
component that actually interacts with the operating system to perform the
class loading on behalf of the different (Java) class loaders 110.

Also present in the JVM is a heap 140, which is used for storage of
objects 145 (Figure 2 shows the heap 140 only at a high level; see Figure 5
below for more details). Each loaded class represents an object, and
therefore can be found on the heap. In Java a class effectively defines a
type of object, and this is then instantiated one or more times in order to
utilise the object. Each such instance is itself an object which can be
found in heap 140. Thus the objects 145 shown in the heap in Figure 2 may
represent class objects or other object instances. (Note that strictly the
class loaders as objects are also stored on heap 140, although for the sake
of clarity they are shown separately in Figure 2). Although heap 140 is
shared between all threads, typically for reasons of operational
efficiency, certain portions of heap 140 can be assigned to individual
threads, effectively as a small region of local storage, which can be used
in a similar fashion to a cache for that thread.

The JVM also includes a class storage area 160, which is used for
storing information relating to the class files stored as objects in the
heap 140. This area includes the method code region 164 for storing byte
code for implementing class method calls, and a constant pool 162 for
storing strings and other constants associated with a class. The class
storage area also includes a field data region 170 for sharing static
variables (static in this case implies belonging to the class rather than
individual instances of the class, or, to put this another way, shared
between all instances of a class), and an area 168 for storing static
initialisation methods and other specialised methods (separate from the
main method code 164). The class storage area further includes a method
block area 172, which is used to store information relating to the code,
such as invokers, and a pointer to the code, which may for example be in
method code area 164, in JIT code area 185 (as described in more detail
below), or loaded as native code such as C, for example as a dynamic link
library (DLL).



Classes stored as objects 145 in the heap 140 contain a reference to
their associated data such as method byte code etc in class storage area
160. They also contain a reference to the class loader which loaded them
into the heap, plus other fields such as a flag (not shown) to indicate
whether or not they have been initialised.

Figure 2 further shows a monitor pool 142. This contains a set of 
locks (monitors) that are used to control access to an object by different 
threads. Thus when a thread requires exclusive access to an object, it

first obtains ownership of its corresponding monitor. Each monitor can 
maintain a queue of threads waiting for access to any particular object. 
Hash table 141 is used to map from an object in the heap to its associated 
monitor. 
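
As a rough illustration of the mapping from objects to monitors, the sketch below keeps one lock per object in an identity-keyed table. The class `MonitorPool` and its method `monitorFor` are hypothetical names for this example, and `ReentrantLock` from the standard library is used here merely as a convenient stand-in for a JVM-internal monitor.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical monitor pool: maps each object (by identity) to its own lock. */
final class MonitorPool {
    // Identity-based table, so two equal but distinct objects get separate monitors.
    private final Map<Object, ReentrantLock> table = new IdentityHashMap<>();

    /** Returns the monitor for the object, creating it on first request. */
    synchronized ReentrantLock monitorFor(Object o) {
        // A fair lock keeps waiting threads in a first-come, first-served queue.
        return table.computeIfAbsent(o, k -> new ReentrantLock(true));
    }
}
```

A thread wanting exclusive access would call `monitorFor(obj).lock()` and release it with `unlock()` in a `finally` block.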



Another component of the JVM is the interpreter 156, which is 
responsible for reading in Java byte code from loaded classes, and 
converting this into machine instructions for the relevant platform. From 
the perspective of a Java application, the interpreter effectively 
simulates the operation of a processor for the virtual machine. 



Also included within the JVM are class loader cache 180 and garbage
collection (GC) unit 175. The former is effectively a table used to allow a
class loader to trace those classes which it initially loaded into the JVM.
The class loader cache therefore allows each class loader to check whether
it has loaded a particular class, which is part of the operation of walking
the class loader hierarchy described above. Note also that it is part of the
overall security policy of the JVM that classes will typically have
different levels of permission within the system based on the identity of
the class loader by which they were originally loaded.

Garbage collection (GC) facility 175 is used to delete objects from
heap 140 when those objects are no longer required. Thus in the Java
programming language, applications do not need to specifically request or
release memory, rather this is controlled by the JVM. Therefore, when Java
application 50 creates an object 145, the JVM secures the requisite memory
resource. Then, when Java application 50 finishes using object 145, the JVM
can delete the object to free up this memory resource. This latter process
is known as garbage collection, and is generally performed by briefly
interrupting all threads 71, 72, and scanning the heap 140 for objects
which are no longer referenced, and hence can be deleted. The garbage
collection of the preferred embodiment is described in more detail below.



The JVM further includes a just-in-time (JIT) compiler 190. This
forms machine code to run directly on the native platform by a compilation
process from the class files. The machine code is created typically when
the application program is started up or when some other usage criterion is
met, and is then stored for future use. This improves run-time performance
by avoiding the need for this code to be interpreted later by the
interpreter 156.

Another component of the JVM is the stack area 195, which is used for
storing the stacks 196, 198 associated with the execution of different
threads on the JVM. Note that because the system libraries and indeed parts
of the JVM itself are written in Java, and these frequently use
multi-threading, the JVM may be supporting multiple threads even if the
user application 50 running on top of the JVM contains only a single thread
itself.

It will be appreciated of course that Figure 2 is simplified, and
essentially shows only those components pertinent to an understanding of
the present invention. Thus for example the heap may contain thousands of
Java objects in order to run Java application 50, and the JVM contains many
other components (not shown) such as diagnostic facilities, etc.

Figure 3 is a flowchart illustrating the operations conventionally
performed to load a class in order to run a Java application. The first
operation is loading (step 310) in which the various class loaders try to
retrieve and load a particular class. The next operation is linking, which
comprises three separate steps. The first of these is verification (step
320), which essentially checks that the code represents valid Java
programming, for example that each instruction has a valid operational
code and that each branch instruction goes to the beginning of another
instruction (rather than the middle of an instruction). This is followed by
preparation (step 330) which amongst other things creates the static fields
for a class. The linking process is completed by the step of resolution, in
which a symbolic reference to another class is typically replaced by a
direct reference (step 340).

At resolution the JVM may also try to load additional classes
associated with the current class. For example, if the current class calls
a method in a second class then the second class may be loaded now.
Likewise, if the current class inherits from a superclass, then the
superclass may also be loaded now. This can then be pursued recursively; in
other words, if the second class calls methods in further classes, or has
one or more superclasses, these too may now be loaded. Note that it is up
to the JVM implementation how many classes are loaded at this stage, as
opposed to waiting until such classes are actually needed before loading
them.

The final step in Figure 3 is the initialisation of a loaded class
(step 350), which represents calling the static initialisation method (or
methods) of the class. According to the formal JVM specification, this
initialisation must be performed once and only once before the first active
use of a class, and includes things such as setting static (class)
variables to their initial values (see the above-mentioned book by Lindholm
and Yellin for a definition of "first active use"). Note that
initialisation of an object also requires initialisation of its
superclasses, and so this may involve recursion up a superclass tree in a
similar manner to that described for resolution. The initialisation flag in
a class object 145 is set as part of the initialisation process, thereby
ensuring that the class initialisation is not subsequently re-run.
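
The once-only behaviour of class initialisation can be observed directly in Java. The short example below is illustrative only; the class names `InitDemo` and `Lazy` and the counter `initCount` are invented for the purpose.

```java
/** Demonstrates that a static initialiser runs once, at first active use. */
final class InitDemo {
    // Incremented by Lazy's static initialiser each time that initialiser runs.
    static int initCount = 0;

    static final class Lazy {
        static int value;

        // Static initialisation: executed by the JVM when Lazy is initialised.
        static {
            initCount++;
            value = 42;
        }

        static int read() {
            return value;
        }
    }

    /** Invoking a static method of Lazy is an active use and triggers initialisation. */
    static int firstUse() {
        return Lazy.read();
    }
}
```

Reading `InitDemo.initCount` before any call to `firstUse()` gives 0, because nothing has yet actively used `Lazy`; after one or many calls it stays at 1.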

The end result of the processing of Figure 3 is that a class has been 
loaded into a consistent and predictable state, and is now available to 
interact with other classes. In fact, typically at start up of a Java 
program and its concomitant JVM, some 1000 objects are loaded prior to 
actual running of the Java program itself, these being created from many
different classes. This gives some idea of the initial delay and overhead
involved in beginning a Java application. 

As mentioned above, the problems caused by this initial delay can be
greatly reduced by serial reuse of a JVM, thereby avoiding the need to
reload system classes and so on. Figure 4 provides a high-level flowchart
of a preferred method for achieving such serial reuse. The method commences
with the start of the middleware subsystem 45, which in turn uses the Java
Native Interface (JNI) to perform a Create JVM operation (step 410). Next
an application or transaction to run on the JVM is loaded by the
Application class loader 120. The middleware includes Java routines to
provide various services to the application, and these are also loaded at
this point, by the Middleware class loader 124.

The application can now be run (step 420), and in due course will 
finally terminate. At this point, instead of terminating the JVM as well as 
the application, the middleware subsystem makes a Reset JVM call to the JVM 
(step 430). The middleware classes may optionally include a tidy-up method 
and/or a reinitialize method. Both of these are static methods. The JVM 
responds to the Reset JVM by calling the tidy-up method of the middleware 
classes (step 440). The purpose of this is to allow the middleware to leave 
the JVM in a tidy state, for example removing resources and closing files 
that are no longer required, and deleting references to the application 
objects. In particular, all those middleware classes which have been used 
since the previous JVM reset (or since the JVM was created if no resets 
have occurred) have their tidy-up method called, assuming of course that 
they have a tidy-up method (there is no requirement for them to have such a 
tidy-up method). 
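
By way of illustration only, the tidy-up call on reset might be sketched 
in Java as follows. The method name tidyUp, the classes LogService, 
TickCounter and ResetDriver, and the use of reflection are all assumptions 
made for this sketch; in the system described the call is made by the JVM 
itself rather than by Java code. 

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical middleware class. The method name tidyUp is an assumption.
class LogService {
    static int openHandles = 0;
    static final List<Object> appRefs = new ArrayList<>();

    static void use(Object appObject) {
        openHandles++;
        appRefs.add(appObject);
    }

    // Static: runs even if no LogService instance currently exists.
    static void tidyUp() {
        openHandles = 0;
        appRefs.clear(); // delete references to application objects
    }
}

// Hypothetical middleware class with no tidy-up method, which is permitted.
class TickCounter { }

class ResetDriver {
    // Calls the static tidyUp method, where one exists, on each middleware
    // class used since the previous reset; returns how many were called.
    static int tidyUpAll(List<Class<?>> usedSinceReset) {
        int called = 0;
        for (Class<?> c : usedSinceReset) {
            try {
                Method m = c.getDeclaredMethod("tidyUp");
                if (Modifier.isStatic(m.getModifiers())) {
                    m.setAccessible(true);
                    m.invoke(null);
                    called++;
                }
            } catch (NoSuchMethodException e) {
                // No tidy-up method on this class: nothing to do.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("tidy-up failed for " + c, e);
            }
        }
        return called;
    }
}
```

Because the method is static, the driver passes null as the receiver, 
which reflects the point that tidy-up applies to the class rather than to 
any object instance. 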

The tidy-up method may be similar to the finalise method of a class, 
which is a standard Java facility to allow an object to perform some 
close-down operation. However, there is an important difference in that 
tidy-up is a static method. This means that contrary to the finalise method 
it applies to the class rather than any particular object instance, and so 
will be called even if there are no current object instances for that 
class. In addition the timing of the tidy-up method is different from 
finalise, in that the former is called in response to a predetermined 
command to reset the JVM. In contrast, in accordance with the JVM 
specification, the finalise method is only triggered by a garbage 
collection. More particularly, if an object with a finalizer method is 
found to be unreachable during a garbage collection (ie it is no longer 
effectively active) then it is queued to the finalize thread, which then 
runs the finalizer method after the garbage collection is completed. Note 
that the finalizer method of an object may never be called, if an 
application finishes and the JVM shuts down without the system needing to 
perform a garbage collection. 

Once the tidy-up has been completed, a refresh heap operation is 
performed (step 445). As will be described in more detail below, this 
deletes those portions of the heap that relate to the application or 
transaction that has just been completed, generally analogous to a garbage 
collection cycle. Note that many of the objects deleted here might not have 
been removable prior to the tidy-up method, since they could still have 
been referenced by the middleware classes. 

At this point, the middleware subsystem makes a determination of 
whether or not there is another application to run on the JVM (step 450). 
If not, the middleware subsystem uses the JNI to make a Destroy JVM call 
(step 460) which terminates the JVM, thereby ending the method of Figure 4. 
If on the other hand there is another application to run, then this new 
application is started by the middleware. The system responds to this new 
application by calling in due course the reinitialisation method in each of 
the middleware classes to be reused (step 455). The purpose of this is to 
allow the middleware classes to perform certain operations which they might 
do at initialisation, thereby sidestepping the restriction that the JVM 
specification prevents the initialisation method itself being called more 
than once. As a simple example, the reinitialisation may be used to reset a 
clock or a counter. As shown in Figure 4, the system is now in a position 
to loop round and run another application (step 420). 
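
The control flow of Figure 4 can be summarised by the following minimal 
sketch. VmModel and SerialReuse are hypothetical names introduced purely 
for illustration; in the system described the Create, Reset and Destroy 
calls are made by the middleware subsystem through the JNI, and the log 
strings here merely stand in for the real operations. 

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the JVM as seen by the middleware subsystem.
class VmModel {
    final List<String> log = new ArrayList<>();
    boolean dirty = false;

    void create() { log.add("create"); }                 // step 410
    void run(String app) { log.add("run:" + app); }      // step 420
    boolean reset() {                                    // step 430
        log.add("tidy-up");                              // step 440
        log.add("refresh-heap");                         // step 445
        return !dirty;   // false models a failure return code
    }
    void reinitialise() { log.add("reinitialise"); }     // step 455
    void destroy() { log.add("destroy"); }               // step 460
}

class SerialReuse {
    // Runs the queued applications one after another on a single VM.
    static List<String> runAll(Queue<String> apps) {
        VmModel vm = new VmModel();
        vm.create();
        while (!apps.isEmpty()) {
            vm.run(apps.remove());
            boolean clean = vm.reset();
            if (apps.isEmpty() || !clean) {
                break;                                   // step 450: no more work
            }
            vm.reinitialise();
        }
        vm.destroy();
        return vm.log;
    }
}
```

In this sketch reinitialisation is shown as a single call before the next 
application starts; this is a simplification of the behaviour described, 
in which it occurs in due course as each class is first used. 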

It is generally expected that the reinitialisation method will be 
similar in function to the initialisation method, but there may well be 
some differences. For example, it may be desired to reset static variables 
which were initialised implicitly. Another possibility is to allow some 
state or resources to persist between applications; for example, if a class 
always outputs to one particular log file which is set up by the 
initialisation method, it may be more efficient to keep this open in 
between successive JVMs, transparent to the application. 



It should be noted that whilst Figure 4 indicates the distinct 
logical steps performed by the method of the invention, in practice these 
steps are not all independent. For example, calling the tidy-up methods 
(step 440) is part of the overall reset JVM operation (step 430). Likewise, 
calling the reinitialisation methods (step 455) is effectively part of the 
start-up processing of running the new application (step 420). Thus 
reinitialisation is performed prior to first active use of a class, and 
this may occur at any stage of a program. Therefore class reinitialisation 
(like conventional initialisation) is not necessarily completed at start-up 
of the program, but rather can be regarded as potentially an ongoing 
process throughout the running of a program. 

It will also be appreciated that there is some flexibility with 
regard to the ordering of the steps shown in Figure 4. In particular, the 
decision of whether or not there is to be another application (step 450) 
could be performed earlier, such as prior to the refresh heap step, the 
tidyup step, and/or the reset JVM step. In the latter case, which 
corresponds to immediately after the first application has concluded (i.e. 
straight after step 420), the alternative outcomes would be to destroy the 
JVM (step 460) if there were no further applications, or else to reset the 
JVM, tidy up, refresh the heap, and reinitialise (steps 430, 440, 445, and 
455) if there were further applications. If instead the decision step 450 
is intermediate these above two extreme positions, the logic flow can be 
determined accordingly. Further details about the implementation of the 
tidyup and reinitialise methods are provided in above-mentioned US patent 
application 09/584641. 

It should be noted that in the preferred embodiment, the ability to 
reset the JVM, and to have tidyup and reinitialise methods, is only 
available for middleware classes (i.e. those loaded by the middleware class 
loader). This is to allow the middleware classes to be re-used by 
successive applications or transactions, for which they can perform various 
services. The basis for this approach is that typically the middleware is a 
relatively sophisticated and trusted application, and so can be allowed to 
take responsibility for proper implementation of the tidy-up and 
reinitialise methods. On the other hand, the transactions that run within 
the middleware are not treated as reliable. 



Note also that the system classes themselves do not have tidyup or 
reinitialisation methods, despite persisting across a JVM reset. Rather, if 
the middleware makes any change to a system class, then the middleware 
itself is expected to take the necessary action (if any) for a reset with 
respect to the system class as part of the middleware's own tidyup 
operation. 

An important part of the reset JVM/tidyup operation (steps 430 and 
440) in the preferred embodiment is to make sure that the JVM is in a state 
which is amenable to being tidied up. If this is the case, the JVM is 
regarded as being clean; if not, it is regarded as being dirty or 
contaminated. 

Considering this in more detail, if the application has performed 
certain operations, then it will not be possible for the middleware classes 
to be certain that their tidy-up and reinitialise methods will fully reset 
the system to a fresh state. With such a contaminated JVM, the system still 
calls the tidy-up methods of the class objects as per normal (step 440), 
but the return code back to the middleware associated with the reset JVM 
operation (step 430) effectively indicates failure. The expectation is 
that the JVM would actually be terminated by the middleware subsystem at 
this point, as it is no longer in a predictable condition. 

One important situation which would prevent the JVM from being able 
to properly reset is where the application has performed certain operations 
directly, such as making security or environment changes, running native 
code, or performing Abstract Windowing Toolkit (AWT) operations. These 
affect the state of the JVM or the underlying computer system and cannot be 
reliably tidied up by the middleware, for the simple reason that the 
middleware does not necessarily know about them. Such changes could then 
persist through a reset JVM call, and contaminate the JVM for any future 
applications. In contrast, if an application performs such operations 
through a middleware call, then this does not cause any problems, because 
the middleware now does know about the situation and so can perform 
whatever tidyup measures are required. 

The JVM thus monitors for operations that may prevent proper reset, 
including whether they have been performed by an application or middleware. 
This is determined by the JVM keeping track of its context, which is set to 
application context for an application class, and to middleware context for 
a middleware class, whilst a primordial or extensions class has no impact on 
the existing context of application or middleware. In particular, context 
can be determined based on the type of class which contains the method that 
is currently being performed, whilst the type of class is determined from 
its original class loader. 

As previously mentioned, the list of problematic operations given 
above only causes difficulty when performed in an application context, 
since in a middleware context it is possible for them to be reset by the 
appropriate tidy-up routines of the relevant middleware classes. 
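
A minimal sketch of this rule is given below. The names Context and 
ResetMonitor, and the idea of a single boolean dirty flag, are assumptions 
made for illustration and are not taken from the embodiment itself. 

```java
// Hypothetical names; the two contexts mirror those described in the text.
enum Context { APPLICATION, MIDDLEWARE }

class ResetMonitor {
    private boolean dirty = false;

    // Invoked whenever a problematic operation is attempted, for example
    // running native code or changing security settings.
    void onProblematicOperation(Context current) {
        if (current == Context.APPLICATION) {
            // The middleware cannot know about this change, so a later
            // reset could not be trusted to restore a fresh state.
            dirty = true;
        }
        // In middleware context the tidy-up methods are expected to cope.
    }

    // Would determine the return code of the Reset JVM call.
    boolean isClean() { return !dirty; }
}
```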

Referring now to Figure 5, in the preferred embodiment the heap 140 
is logically split into three components (objects in one component can 
reference objects in another component). In particular, at the bottom 
(logically) of heap 140 is middleware section 510, and at the top of the 
heap is transient section 520. The data in these two heaps grows towards 
each other, thus transient heap grows in the direction of arrow 521, and 
middleware heap in the direction of arrow 511. The middleware heap is 
defined by boundary 512, and the transient heap by boundary 522, with 
unassigned space 515 between them. It should be appreciated that boundaries 
512 and 522 represent the maximum size currently assigned to the two heaps, 
rather than their current fill levels - these are instead shown by dashed 
lines 513 and 523. In other words, as the middleware heap fills up, the 
fill level 513 will approach towards middleware heap boundary 512; likewise 
as the transient heap fills up, the fill level 523 will approach towards 
transient heap boundary 522. Finally, and separate from the transient heap 
and middleware heap, is system heap 550. Note that the combined transient 
and middleware heaps, together with intervening unassigned space, are 
allocated from a single physically contiguous block of memory 560. In 
contrast, the system heap 550 may be formed from multiple non-contiguous 
regions of memory. 

In one preferred embodiment, memory 560 comprises 54 MBytes, and the 
initial size of the middleware and transient heaps is 0.5 Mbyte each. Thus 
it can be seen that initially the unassigned region 515 dominates, although 
as will be discussed in detail below, the transient and middleware heaps 
are allowed to expand into this space. However, these values are exemplary 
only, and suitable values will vary widely according to machine 
architecture and size, and also the type of application. 
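
The layout of block 560 can be modelled with a few integers, as in the 
sketch below. The class DualHeap, its field names and the simple bump 
allocation are assumptions chosen for illustration; sizes are in arbitrary 
units, and the comments give the reference numerals of Figure 5 to which 
each field is intended to correspond. 

```java
// Models one contiguous block holding two heaps that grow towards each other.
class DualHeap {
    final int total;            // size of block 560
    int middlewareBoundary;     // 512: middleware heap is [0, middlewareBoundary)
    int transientBoundary;      // 522: transient heap is [transientBoundary, total)
    int middlewareFill = 0;     // 513: grows upwards
    int transientFill;          // 523: grows downwards

    DualHeap(int total, int initialHeapSize) {
        this.total = total;
        this.middlewareBoundary = initialHeapSize;
        this.transientBoundary = total - initialHeapSize;
        this.transientFill = total;
    }

    // Size of unassigned space 515.
    int unassigned() { return transientBoundary - middlewareBoundary; }

    // Returns the start address, or -1 if the assigned region is full.
    int allocMiddleware(int size) {
        if (middlewareFill + size > middlewareBoundary) return -1;
        int address = middlewareFill;
        middlewareFill += size;
        return address;
    }

    // Returns the start address, or -1 if the assigned region is full.
    int allocTransient(int size) {
        if (transientFill - size < transientBoundary) return -1;
        transientFill -= size;
        return transientFill;
    }

    // Moves boundary 522 down into the unassigned space, if there is room.
    boolean expandTransient(int by) {
        if (by > unassigned()) return false;
        transientBoundary -= by;
        return true;
    }
}
```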



Heap control block 530 is used for storing various information about 
the heap, such as the location of the heap within memory, and the limits of 
the transient and middleware sections as defined by limits 512 and 522. 
Free chain block 532 is used for listing available storage locations within 
the middleware and transient sections (there is actually one free chain 
block for each section). Thus although the middleware and transient heaps 
start to fill sequentially, the likely result of a garbage collection cycle 
is that space may become available within a previously occupied region. 
Typically therefore there is no single fill line such as 513, 523 between 
vacant and occupied space, rather a fragmented pattern. The free chain 
block is a linked list which specifies the location and size of empty 
regions within that section of the heap. It is quick to determine whether 
and where a requested amount of storage is available in the heap by simply 
scanning through the linked list. Note that in the preferred embodiment, 
empty regions in the heap which are below a predetermined size (typically a 
few hundred bytes) are excluded from the free chain list. This prevents the 
list from becoming too long through containing a large number of very small 
vacant regions, although it does mean that these regions effectively become 
inaccessible for storage (although they can be retrieved later, as 
described in more detail below). 
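
The free chain can be pictured with the following sketch. The class 
FreeChain and the first-fit scanning policy are assumptions made for 
illustration, since the text states only that the linked list is scanned; 
the minimum region size is a constructor parameter here. 

```java
import java.util.Iterator;
import java.util.LinkedList;

// A linked list of vacant regions, each recorded by location and size.
class FreeChain {
    static final class Region {
        int start;
        int size;
        Region(int start, int size) { this.start = start; this.size = size; }
    }

    private final LinkedList<Region> regions = new LinkedList<>();
    private final int minSize;   // regions smaller than this are not listed

    FreeChain(int minSize) { this.minSize = minSize; }

    // Records a vacant region; returns false if it is too small to list.
    boolean addFree(int start, int size) {
        if (size < minSize) return false;
        regions.add(new Region(start, size));
        return true;
    }

    // First-fit scan. Returns the start address of the storage, or -1.
    int allocate(int size) {
        Iterator<Region> it = regions.iterator();
        while (it.hasNext()) {
            Region r = it.next();
            if (r.size >= size) {
                int address = r.start;
                r.start += size;
                r.size -= size;
                if (r.size < minSize) it.remove(); // remainder too small to keep
                return address;
            }
        }
        return -1;
    }

    int length() { return regions.size(); }
}
```

Note that the remainder dropped by allocate is exactly the kind of small 
region that becomes inaccessible until it is later recovered. 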

The transient heap 520 is used for storing objects having no 
expected currency beyond the end of the application or transaction, 
including application object instances, and primordial object instances and 
arrays created by application methods (arrays can be regarded as a 
specialised form of object). Since the lifetime of such objects is 
commensurate with the application itself, it should be possible to delete 
all the objects in the transient heap at the end of the application. The 
application class objects are also on the transient heap. In contrast, the 
middleware heap 510 is used for storing objects which have a life 
expectancy longer than a single transaction, including middleware object 
instances, and primordial object instances and arrays created by middleware 
methods. In addition, string objects and arrays for strings interned in the 
Interned String Table are also stored in the middleware heap (the Interned 
String Table is a tool whereby if multiple identical strings are to be 
stored on the heap, it is possible to store only one copy of the string 
itself, which can then be referenced elsewhere). Lastly, the system heap 
550 is used for storing primordial class objects and reusable class 
objects, where the term reusable class object is used to denote a class 
which can be used again after JVM reset. 

The type of class is dependent on the class loader which originally 
loaded it, in other words a middleware class and an application class are 
loaded by the middleware class loader 124 and the application class loader 
120 respectively. For the purposes of the present discussion, primordial 
classes can be considered as classes loaded by the Primordial or Extensions 
class loader (130 and 125 respectively in Figure 2). In the preferred 
embodiment, classes loaded by the middleware class loader are automatically 
regarded as reusable. 






It is clear from above that instances of primordial classes, such as 
the basic string class java/lang/String, can be located either in the 
middleware heap or the transient heap, depending on the method which 
created them. In a preferred embodiment of the present invention, the 
determination of where to place such primordial class instances is based on 
the current context described above (also referred to as method-type). Thus 
if a method belonging to an application class is invoked, the context or 
method-type becomes Application, whilst if a method belonging to a 
middleware class is invoked, the method-type becomes Middleware. Finally, 
if a method belonging to a primordial class is invoked, the method-type is 
unchanged from its previous value. The context or method-type is stored in 
the Java frame for the method (which is stored on stack 195 - see Figure 
2); at the completion of the method, the method-type reverts to its value 
at the time the method was invoked, which was stored in the previous frame. 

It should be noted that for the above purpose a method belongs to the 
class that actually defines it. For example, if class A subclasses class B, 
but does not override method C, then method C belongs to class B. Therefore 
the method-type is that of class B, even if method C is being run for an 
instance of class A. In addition, the reason for tracking method-type on a 
per-thread basis is that it is possible for various threads within an 
application to be executing different methods having different context. 
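
The method-type bookkeeping described above can be sketched as a per-thread 
stack of values, one per frame. The names ClassKind, MethodType and 
ThreadContext are assumptions for this sketch, and a Deque stands in for 
the Java frames held on stack 195. 

```java
import java.util.ArrayDeque;
import java.util.Deque;

enum ClassKind { APPLICATION, MIDDLEWARE, PRIMORDIAL }
enum MethodType { APPLICATION, MIDDLEWARE }

// One instance per thread: each frame records the method-type in force.
class ThreadContext {
    private final Deque<MethodType> frames = new ArrayDeque<>();

    ThreadContext(MethodType initial) { frames.push(initial); }

    MethodType current() { return frames.peek(); }

    // definingClass is the kind of the class that actually defines the
    // method, which need not be the class of the receiving instance.
    void enter(ClassKind definingClass) {
        MethodType next;
        switch (definingClass) {
            case APPLICATION: next = MethodType.APPLICATION; break;
            case MIDDLEWARE:  next = MethodType.MIDDLEWARE;  break;
            default:          next = current(); // primordial: unchanged
        }
        frames.push(next);
    }

    // On method completion the previous frame's value applies again.
    void exit() { frames.pop(); }

    // Where a new primordial instance (such as a String) would be placed.
    String heapForPrimordialInstance() {
        return current() == MethodType.APPLICATION ? "transient" : "middleware";
    }
}
```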

The transient region of the heap, containing objects created by the 
application or transaction, is subject to normal garbage collection, but 
the intention is that it will be sufficiently large that this is unlikely 
to occur within the lifetime of a typical application. At the end of each 
application, the transient region of the heap is reset. (The repetition of 
this pattern will thereby avoid having to perform garbage collection during 
most typical applications). In contrast the middleware region generally 
contains objects created by the trusted middleware. It is again subject to 
conventional garbage collection, although in a transaction environment it 
is expected that the majority of objects will be created in the transient 
heap, so that garbage collection is not expected to occur frequently. 
Moreover the system typically tries to perform garbage collection of the 
middleware heap at the same time as reset of the transient heap, in other 
words between rather than during transactions (this is discussed in more 
detail below). The middleware heap is not cleared between applications, but 
rather remains to give the middleware access to its persistent state (it is 
assumed that the middleware can take responsibility for resetting itself to 
the correct state to run the next application). 

The preferred embodiment is actually somewhat more complicated than 
described above, in that it supports two types of application class loader, 
one of which is for standard application classes, the other for reusable 
application classes. The motivation here is that when the next transaction 
is to run, it will in fact require many of the same application classes as 
the previous transaction. Therefore it is desirable to retain some 
application system classes rather than having to reload them, although 
certain additional processing is required to make them look newly loaded to 
the next transaction. Conversely it would be possible to have a second 
middleware class loader which is for non-reusable middleware classes. In 
the former situation the reusable application classes are treated 
essentially in the same manner as the reusable middleware classes (eg 
loaded into the system heap); in the latter situation the non-reusable 
middleware classes would be treated similarly to the non-reusable 
application classes but loaded into the middleware heap (since they may 
exist after the conclusion of a transaction, even if they do not endure for 
the next transaction). However, for present purposes in order to explain 
the invention more clearly, it will be assumed that all the middleware 
classes are reusable and that none of the application classes are 
reusable. 

The introduction of multiple heaps for different types of objects 
allows the handling of the heap to be fine-tuned to the requirements of 
those types of object. For example, it may be desirable for the transient 
heap to allocate a larger thread local heap cache. In addition, utilising a 
single block of memory for the transient and middleware heaps improves 
space usage, in that a given region of memory can be flexibly assigned to 
either the transient or middleware heap, depending on particular 
application requirements. On the other hand it does lead to some 
complications in terms of heap management, especially as regards control of 
heap size. Thus in simple terms, as more and more objects are created, 
there is a choice to either enlarge the size of the heap, or to perform a 
garbage collection to maintain the heap within current size limits. The 
former option is generally quick, but will eventually lead to the 
exhaustion of heap space; in contrast, a garbage collection is relatively 
slow, since it interrupts processing, but does constrain the heap size to 
within predetermined limits. Overall, the preferred embodiment tries to 
avoid garbage collections during transactions as much as possible, thereby 
optimising performance for the transaction, and to rely instead on the heap 
refresh described below, which is performed at the end of the transaction 
as part of the JVM reset. 

More specifically, the policy for expansion and garbage collection in 
terms of system heap 550 is straightforward, in that objects in this heap 
are never garbage collected; rather this heap simply expands to accommodate 
all relevant class objects. However, the policy for transient and 
middleware heaps is more complex, because these two heaps are 
interdependent, in that they share the same memory space. In order to 
better understand this policy, it will be helpful to firstly review in more 
detail the garbage collection strategy of the preferred embodiment, as 
shown in Figures 6A and 6B. In particular, the method involves firstly a 
mark phase, which marks all objects in the heap that are currently in use 
(known as live or active objects), and secondly a sweep phase, which 
represents the actual deletion of objects from the heap. Note that general 
background on garbage collection algorithms can be found in "Garbage 
Collection: Algorithms for Automatic Dynamic Memory Management" by R Jones 
and R Lins, Wiley, 1996 (ISBN 0 471 4), whilst one implementation for 
garbage collection in a system having multiple heaps is described in "A 
customisable memory management framework for C++" by G Attardi, T Flagella, 
and P Iglio, in Software Practice and Experience, vol 28/11, 1998. 



As shown in Figure 6A, the method starts with a review of the 
registers and stack, both the Java stack, as shown in Figure 2, and also 
the C stack (assuming that the JVM 40 is running as a C application on OS 
30, see Figure 1) (step 610). Each thirty-two bit data word (for a 32-bit 
system) contained therein could represent anything, for example a real 
number, or part of a string, but it is assumed at least initially that it 
may denote a 32 bit reference to an object location in the heap. To firm up 
on this assumption, three tests are made. Firstly, it is tested whether or 
not the number references a location within the heap (step 612); if not 
then the number cannot represent an object reference. Secondly, in the 
preferred embodiment, all objects commence on an 8-byte boundary. Thus if 
the location corresponding to the data word from the stack/register does 
not fall on an object boundary (tested at step 615), then the original 
assumption that the data/number represents a reference to the heap must 
again be rejected. Thirdly, in the preferred embodiment, a table 538 is 
maintained (see Figure 5) which has a bit for each object location in the 
heap; this bit is set to unity if there is an object stored at that 
location, and zero if no object is stored at that location (the relevant 
bit is updated appropriately whenever an object is created, deleted, or 
moved). If the data word from the stack/register corresponds to an object 
location for which the bit is zero, in other words, no object at that 
location, then once more the original assumption that the data/number 
represents a reference to the heap must be rejected (step 620). If the data 
word passes all three of the tests of steps 612, 615 and 620, then there 
are three remaining possibilities: (a) the word references an object on the 
heap; (b) the word is an integer that happens to have the same value as the 
object reference; or (c) the word is a previous value from uninitialized 
storage. As a conservative measure, it is assumed that option (a) is 
correct, and so the object is marked as live (step 625). A special array of 
bits is provided (block 534, see Figure 5), one bit per object, in order to 
store these mark bits. If there remain other values on the stacks/registers 
to test (step 630), the method then loops back to examine these in the same 
manner as just described; if not the first stage of the mark process is 
complete. 
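
The three tests applied to each word can be sketched as follows. The class 
ConservativeMarker is a hypothetical model: addresses are held in a long 
for convenience rather than as 32-bit words, the heap base is assumed to be 
8-byte aligned, and two BitSet instances stand in for table 538 and mark 
bit array 534. 

```java
import java.util.BitSet;

class ConservativeMarker {
    static final int ALIGN = 8;                     // objects start on 8-byte boundaries
    private final long heapBase;                    // assumed 8-byte aligned
    private final long heapSize;                    // in bytes
    private final BitSet allocBits = new BitSet();  // models table 538
    private final BitSet markBits = new BitSet();   // models block 534

    ConservativeMarker(long heapBase, long heapSize) {
        this.heapBase = heapBase;
        this.heapSize = heapSize;
    }

    private int slot(long address) {
        return (int) ((address - heapBase) / ALIGN);
    }

    // Notes that an object starts at the given (valid, aligned) address.
    void recordObjectAt(long address) { allocBits.set(slot(address)); }

    boolean isMarked(long address) { return markBits.get(slot(address)); }

    // Applies the three tests to one stack or register word and marks the
    // object conservatively if all pass. Returns true if a mark was made.
    boolean considerWord(long word) {
        if (word < heapBase || word >= heapBase + heapSize) return false; // step 612
        if ((word - heapBase) % ALIGN != 0) return false;                 // step 615
        if (!allocBits.get(slot(word))) return false;                     // step 620
        markBits.set(slot(word));                                         // step 625
        return true;
    }
}
```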

In the second stage of the mark process, shown in Figure 6B, the 
objects marked as live are copied onto a list of active objects (step 635) 
(in the preferred embodiment objects are actually copied to the active list 
when originally marked, ie at the same time as step 625 in Figure 6A). An 
object from this list is then selected (step 640), and examined to see if 
it contains any references (step 645). Note that this is a reasonably 
straightforward procedure, because the structure of the object is known 
from its corresponding class file, which defines the relevant variables to 
be used by the object. Any objects referenced by the selected object are 
themselves marked (step 650) and added to the active list (step 655). Next, 
the selected object is removed from the active list (step 660), and then a 
test is performed (step 665) to determine if the active list is empty; if 
not, processing loops back to step 640 to select another object from the 
active list. Finally, when step 665 produces a positive outcome, all 
objects that are active, because they are referenced directly or indirectly 
from the stacks or registers, have been appropriately marked. 
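
The second stage amounts to a worklist traversal, sketched below. The 
classes Node and Tracer are hypothetical. The check that skips objects 
which are already marked is an assumption added here so that the loop 
terminates on cyclic references; the text does not spell this out. 

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical heap object with outgoing references and a mark bit.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    boolean marked;
    Node(String name) { this.name = name; }
}

class Tracer {
    // roots: objects found by the scan of the stacks and registers.
    static void trace(List<Node> roots) {
        Deque<Node> active = new ArrayDeque<>();
        for (Node root : roots) {
            if (!root.marked) {
                root.marked = true;            // step 625
                active.add(root);              // step 635
            }
        }
        while (!active.isEmpty()) {            // step 665
            Node selected = active.remove();   // steps 640 and 660
            for (Node ref : selected.refs) {   // step 645
                if (!ref.marked) {
                    ref.marked = true;         // step 650
                    active.add(ref);           // step 655
                }
            }
        }
    }
}
```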

The mark stage is then followed by a sweep stage (step 670) and a 
compact stage (step 675). The former garbage collects (ie deletes) all 
those objects which have not been marked, on the basis that they are no 
longer reachable from any live or active object. In particular, each object 
which is not marked as active has its corresponding bit set to zero in 
table 538 (see Figure 5). Runs of zeros in the bit allocation table 538 are 
now identified; these correspond to some combination of the object 
immediately preceding the run, which may extend into the run (since only 
the head of an object is marked in the bit allocation table), and free 
space (released or never filled). The amount of free space in the run of 
zeros can be determined by examining the size of the object immediately 
preceding the run. If the amount of free space exceeds the predetermined 
minimum amount mentioned earlier, then the run is added to the free chain 
list 532 (see Figure 5). 
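
A minimal sketch of the sweep is given below. The class Sweeper, the 
sizeInSlots array (giving the size of the object whose head is at each 
slot) and the use of one bit per allocation slot are assumptions made for 
illustration. Regions of at least minSlots are reported here; whether the 
threshold is inclusive is not specified in the text. 

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class Sweeper {
    // Clears the allocation bit of every object that was not marked.
    static void clearUnmarked(BitSet allocBits, BitSet markBits) {
        allocBits.and(markBits);
    }

    // Returns {start, length} pairs, in slots, for each free region that is
    // large enough to be placed on the free chain.
    static List<int[]> freeRuns(BitSet allocBits, int[] sizeInSlots,
                                int heapSlots, int minSlots) {
        List<int[]> free = new ArrayList<>();
        int i = 0;
        while (i < heapSlots) {
            if (allocBits.get(i)) {
                // Only the head is flagged, so skip over the object body,
                // which appears as zeros in the table.
                i += sizeInSlots[i];
            } else {
                int end = allocBits.nextSetBit(i);
                if (end < 0 || end > heapSlots) end = heapSlots;
                if (end - i >= minSlots) free.add(new int[] {i, end - i});
                i = end;
            }
        }
        return free;
    }
}
```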




Over time, such sweeping will tend to produce many discontinuous 
vacant regions within the heap, corresponding to the pattern of deleted 
objects. This does not represent a particularly efficient configuration, 
and in addition there will be effective loss of those pieces of memory too 
small to be on the free list. Hence a compact stage (step 675) can be 
performed, which acts to squeeze together those objects which remain in the 
heap after the sweep in order to amass them into a single continuous block 
of storage (one for the transient heap, one for the middleware heap). 
Essentially, this means relocating objects from their initial positions in 
the heap, to a new position so that, as much as possible, they are all 
adjacent to one another. As part of this compaction, the very small regions 
of memory too small to be on the free chain 532 (see Figure 5) should be 
aggregated into larger blocks that can be recorded in the free chain. 

An important requirement of the object relocation of the compaction 
step is of course that references to a moved object are altered to point to 
its new location. This is a relatively straightforward operation for object 
references on the heap itself, since as previously mentioned, they can be 
identified from the known structure of each object, and updated to the 
appropriate new value. However, there is a problem with objects which are 
directly referenced from a register or stack. As discussed above, each 
number in the register/stack is treated for garbage collection purposes as 
if it were an object reference, but there is no certainty that this is 
actually the case; rather the number may represent an integer, a real 
number, or any other piece of data. It is therefore not possible to update 
any object references on the stack or register, because they may not in 
fact be an object reference, but rather some other piece of program data, 
which cannot of course be changed arbitrarily. The consequence of this is 
that it is impossible to move an object which appears to be directly 
referenced from a register or stack; instead these objects must remain in 
their existing position. Such objects are informally known as "dozed" 
objects since they cannot be moved from their current position. 

Two other classes of objects which cannot be moved from the heap are 
class objects, and thread objects (thread objects are control blocks used 
to store information about a thread). The reason for this is that such 
objects are referenced from so many other places in the system that it is 
not feasible to change all these other references. These objects are 
therefore known as "pinned", since like dozed objects they cannot be moved 
from their current position. 

A consequence of pinned and dozed objects is that a compact process 
may not be able to accumulate all objects in a heap into a single 
contiguous region of storage, in that pinned and dozed objects must remain 
in their original positions. The consequences of this are discussed in more 
detail below. 

Note that in the preferred embodiment, a compact stage (step 675) is 
not necessarily employed on every garbage collection cycle, unless this is 
explicitly requested as a user initial set-up option. Rather a compact 
operation is only performed when certain predetermined criteria are met. 
For example, as previously indicated a garbage collection can be triggered 
by a request for storage in the heap that cannot be satisfied. If the 
request still cannot be satisfied after the sweep step 670, because there 
is no single block of memory available of sufficient size, then a compact 
stage is automatically performed, to try and accumulate an adequately-sized 
storage region. 

In the preferred embodiment, the further criteria used for deciding 
whether to compact are different for the middleware heap and the transient 
heap. Thus for the transient heap a compaction is performed whenever the 
amount of free space remaining in the transient heap after the garbage 
collection is less than 5% of the heap capacity. The idea here is that when 
space appears to be running out, the compacting should retrieve some 
additional space from those empty regions too small for the free chain 
list. On the other hand, for the middleware heap more complex compaction 
algorithms are used, based for example on when heap fragmentation exceeds 
certain limits (eg in terms of number of fragments), or where the largest 
block in the free chain list is below a certain size. The rationale here is 
that the middleware heap is likely to be of relatively long duration, and 
so it is worthwhile to try to optimise its overall storage arrangement. 
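
These criteria could be expressed as simple predicates, as in the sketch 
below. The class CompactionPolicy and the threshold parameters 
maxFragments and minLargestBlock are hypothetical; the text gives the 5% 
figure for the transient heap but no specific limits for the middleware 
heap. 

```java
class CompactionPolicy {
    // Transient heap: compact when less than 5% of capacity remains free
    // after a garbage collection.
    static boolean compactTransient(long freeBytes, long capacityBytes) {
        return freeBytes * 100 < capacityBytes * 5;
    }

    // Middleware heap: compact when fragmentation is excessive or when the
    // largest free block is too small. Both thresholds are assumptions.
    static boolean compactMiddleware(int fragments, long largestFreeBlock,
                                     int maxFragments, long minLargestBlock) {
        return fragments > maxFragments || largestFreeBlock < minLargestBlock;
    }
}
```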

Note that although the triggers for garbage collection and compaction
can be different for the middleware and transient heap, when either
operation is performed, in the preferred embodiment it is performed on the
whole of active storage 560 - ie on both the middleware and transient
sections simultaneously. This is because interheap references are
permitted, and so any marking or compaction operation necessarily involves
both heaps. Consequently, once starting a garbage collection or compaction,
it is most efficient to do both heaps at the same time.

One complication to the garbage collection described above is that as
previously mentioned, Java permits objects to have finalizer methods, which
must be run prior to deletion of the object in a garbage collection. In
order to manage this requirement, certain additional processing is required
(not shown in Figure 6). Thus when an object is created on the heap that
has a finalizer method, a reference to that object is added to a set of
finalizer references. At the end of the mark phase of garbage collection,
this set of finalizer references is scanned, to detect any objects in the
set which are not marked - the resultant group represents the objects which
are about to be deleted, and so need to have their finalizer methods run.
To accomplish this, objects in this group now need to be marked as live,
and their references iteratively traced and also marked as live, in similar
fashion as for the main mark phase. The purpose of this is firstly to
retain the objects in order to run their finalizer methods, and secondly to
retain any other objects which are directly or indirectly referenced by
them, so that the finalizer methods run correctly. The finalizer references
for objects in this group are removed from the set of finalizer references
described above, so that their finalizer method will not be activated by
any future garbage collection cycle, and passed to a reference handler. The
subsequent processing is asynchronous, and does not occur until main system
processing is resumed after the garbage collection has concluded (ie after
the end of the processing of Figure 6B). Once the reference handler has
restarted, it passes any object finalizer references it received during the
garbage collection to a finalizer queue. A separate finalizer thread then
runs each entry in the queue in turn, deleting the object reference from
the queue after the corresponding finalizer method has been run.
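
The scan of the finalizer references at the end of the mark phase can be
sketched as follows. This is an illustration under stated assumptions, not
the embodiment's implementation: objects are modelled as plain hashable
values, `references` is a hypothetical dictionary from each object to the
objects it refers to, and `handler` is a simple queue standing in for the
reference handler.

```python
from collections import deque


def process_finalizers(finalizer_refs, marked, references, handler):
    """Run after the main mark phase.

    finalizer_refs: set of objects that have a finalizer method
    marked:         set of objects marked live by the main mark phase
    references:     dict mapping an object to the objects it references
    handler:        deque standing in for the reference handler
    """
    # Objects with finalizers that the mark phase did not reach.
    doomed = [obj for obj in finalizer_refs if obj not in marked]

    # Mark them, and everything reachable from them, as live so that the
    # finalizer methods can still run correctly.
    stack = list(doomed)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(references.get(obj, ()))

    # Remove them from the set (so no later cycle re-activates them) and
    # pass them to the reference handler for asynchronous processing.
    for obj in doomed:
        finalizer_refs.discard(obj)
        handler.append(obj)
    return doomed
```

In this model the finalizer thread would later drain `handler` (via the
finalizer queue) once normal processing resumes.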

Note that objects referenced by the reference handler or on the
finalizer queue are regarded as "live" during a garbage collection process.
In other words they are marked along with any other objects which they
reference, directly or indirectly. This ensures that objects do not get
inadvertently deleted from the finalizer queue, if their wait on this queue
exceeds the time to the next garbage collection. (Thus objects in the
reference handler and finalizer queue form additional roots for live
objects, in addition to those on the stacks and registers as illustrated in
Figure 6; in fact in the preferred embodiment, there are other categories
of roots, for example system class files, but the details are not
pertinent to an understanding of the present invention).

One potential problem with the handling of finalizer methods
described above is that by running them on a dedicated thread (the
finalizer thread), the context of the thread will be different from the
main application thread, where context here indicates general system
properties associated with the thread, such as security permissions. This
can be a particular concern in relation to transaction threads, which as
previously mentioned are regarded as relatively untrustworthy. Therefore,
the preferred embodiment modifies the handling of objects in the transient
heap having finalizer methods. If these are located in a garbage collection
cycle and are not marked, then as described above they are marked, along
with the objects which they reference, directly or indirectly. However, no
further processing is done on these objects; in particular, they are not
removed from the set of finalizer references, and are not passed to the
reference handler. The effect of this is that these objects then simply
continue to appear to the garbage collection process as normal live
objects, and are maintained through each garbage collection cycle. These
objects are eventually deleted in the refresh heap step 445 of the JVM
reset (see Figure 4), which will be described in more detail below.

Returning now to the question of allocating heap space from the
overall memory region 560, which contains both the middleware and transient
sections, the procedure for this is illustrated at a high level in Figure 7
(at this level the same general policy is used for both the middleware and
transient heaps, although as will be seen below, there are some significant
differences in the details of their respective policies). The process
starts with an allocation request (step 705), typically to store an object
on the heap. This causes the free chain block (see Figure 5) for the
relevant heap section to be examined; if there is available space (step
715), then the method proceeds directly to allocating the desired space
(step 795), and exits successfully.

On the other hand, if the test of step 715 is negative, then it means
that the heap is too full to sustain the new allocation. This is equivalent
conceptually to the fill level 513 in Figure 5 approaching the assigned
boundary 512 for the middleware heap, or fill level 523 approaching
assigned boundary 522 for the transient heap. In this situation, the system
first determines whether it is possible to simply expand the amount of
space assigned to the heap (step 725). In simple terms, for the middleware
heap this corresponds to moving assigned boundary 512 upwards into the
unassigned region 515, thereby taking some of the unassigned storage and
allocating it to the middleware heap 510; conversely for the transient
heap, boundary 522 is moved downwards. Of course, it is not possible for
the middleware heap to encroach into the transient heap or vice versa, so
that once the unassigned space 515 has been exhausted, then it is no longer
possible to expand the heaps further. In a situation where heap space is
available, then a policy is defined to determine the amount of extra space
to add to the heap. The general policy in the preferred embodiment is to
increase the heap so that there is 30% free space (taking into account the
new allocation request). However, a predetermined minimum expansion size
is defined (0.5 MByte in the preferred embodiment), so that the expansion
is actually 30% or 0.5 MByte, whichever is greater (subject of course to
the amount of space available). Likewise, the user may also set a maximum
expansion size, which is then used to cap the figure just obtained
(providing it does not prevent satisfying the current allocation request).
Finally, in the preferred embodiment, heap memory is always
assigned/deassigned in units of a predetermined size, which for a 32-bit
system is 64 Kbytes for reasons that will be described later. Therefore
whatever expansion value is determined based on the 30% expansion, this is
adjusted to the appropriate whole number of 64 Kbyte units. Note that in
the preferred embodiment, there are further controls on how the different
heaps are allowed to expand; these are discussed below in more detail.
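
The expansion sizing policy can be illustrated by the following sketch. It
is one possible reading of the description rather than the embodiment's
code. In particular, it assumes that "30% free space" means that 30% of the
enlarged heap is free once the new request has been placed, and that the
adjustment to 64 Kbyte units rounds upwards; neither detail is stated in
the text. All names are hypothetical.

```python
SLICE = 64 * 1024            # allocation unit on a 32-bit system
MIN_EXPANSION = 512 * 1024   # 0.5 MByte minimum expansion
FREE_PERCENT = 30            # target free space after expansion


def expansion_size(assigned, used, request, unassigned, max_expansion=None):
    """Return how many bytes to add to a heap (all arguments in bytes)."""
    # Shortfall that must be covered for the request to succeed at all.
    needed = max(0, used + request - assigned)

    # Heap size at which FREE_PERCENT is free after placing the request
    # (integer ceiling division avoids floating point rounding).
    occupied = used + request
    target = -(-occupied * 100 // (100 - FREE_PERCENT))

    growth = max(target - assigned, MIN_EXPANSION)

    # Optional user cap, ignored if it would prevent satisfying the request.
    if max_expansion is not None and max_expansion >= needed:
        growth = min(growth, max_expansion)

    # Round up to whole slices (assumed direction), then limit to the
    # whole slices actually left in the unassigned region.
    growth = -(-growth // SLICE) * SLICE
    available = (unassigned // SLICE) * SLICE
    return min(growth, available)
```

For example, a full 10 MByte heap receiving a 1 MByte request would grow by
92 slices under these assumptions, unless a user cap or the remaining
unassigned space limits it further.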

After the available expansion space has been determined, it is tested
whether there will now be sufficient space to satisfy the allocation
request (step 735). If so, the relevant heap is duly expanded (step 785);
if not, the method proceeds to step 745, and a garbage collection is
performed. It is now checked whether or not this has created sufficient
space (step 755); if so, the method proceeds to allocate the requested
space (step 795). Note that one minor complexity not shown in Figure 7 is
that the garbage collection (step 745) may perform both a compact
operation, and then also try a heap expansion (equivalent to step 785), if
these are necessary to obtain the requested space. If on the other hand
there is still insufficient space for the allocation request, then as a
final measure, it is possible to shrink the other heap (step 765). Thus
referring back to Figure 5, it can be seen that middleware heap could in
principle lose the assigned but empty space between boundary 512 and fill
level 513, by lowering boundary to fill level 513. The reclaimed space
could then be transferred to the transient heap 520 (assuming that it
already now extended through the region 515 shown in Figure 5 as
unassigned). Conversely, space could be made available for transfer from
the transient heap to the middleware heap by raising boundary 522 towards
fill level 523.

Following the shrinkage of the other heap (step 765) a test is now
made to see if this has created sufficient space for the allocation request
(step 775); if not the system must return an error to the allocation
request (step 780) indicating that no space is available. Assuming however
that space is available, then the heap for which the allocation request is
made can expand (step 785) into the space vacated by the shrinkage of the
other heap, thereby allowing the allocation request to be satisfied (step
795).
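
The high level flow of Figure 7, as described in the preceding paragraphs,
can be traced with the small sketch below. It is a simplified model only:
the four boolean arguments are hypothetical stand-ins for the outcomes of
the tests at steps 715, 735, 755 and 775, and the function merely returns
the sequence of step numbers that would be visited.

```python
def allocation_path(free_block, expansion_suffices, gc_suffices,
                    shrink_suffices):
    """Return the list of Figure 7 step numbers visited for one request."""
    steps = [705, 715]               # request received, free chain examined
    if free_block:
        return steps + [795]         # allocate directly
    steps += [725, 735]              # size an expansion, test if enough
    if expansion_suffices:
        return steps + [785, 795]    # expand the heap, then allocate
    steps += [745, 755]              # garbage collect, test again
    if gc_suffices:
        return steps + [795]
    steps += [765, 775]              # shrink the other heap, test again
    if shrink_suffices:
        return steps + [785, 795]    # expand into the vacated space
    return steps + [780]             # no space available: error
```

A request that can only be met by shrinking the other heap therefore
passes through every test before reaching step 795.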

It will be appreciated that there are many possible variations on the
processing shown in Figure 7. For example Figure 7 shows heap expansion
(step 785) only when this will positively provide the required space (ie
following a positive result from the tests of steps 735, 765, 775), but it
will be appreciated that such heap expansion might be performed
irrespective of whether or not this would create sufficient space for the
allocation request for some or all of these tests. In fact, in the
preferred embodiment, after garbage collection has been performed (step
745), the relevant heap will automatically try to expand to give 30% free
space as previously described, even when the allocation request has already
been satisfied (this is subject to certain limitations described in more
detail below).



In addition, an attempt could be made to shrink the other heap (step
765) before performing garbage collection (step 745), or it may occur
automatically as part of the garbage collection process. Thus in the
preferred embodiment, the assigned boundary for the transient heap (line
522 in Figure 5) is shrunk as much as possible each time the heap has been
compacted, providing that this does not reduce the transient heap below its
initial size. In contrast, although the middleware heap is also shrunk
after compaction in the preferred embodiment, in general some leeway (such
as 30% free space) is left between the heap boundary and the fill level.
The middleware heap is also never reduced below its original size. This
policy balances the fact that the transient heap is allowed to grow more
easily than the middleware heap (as discussed below). More generally, such
shrinkage after compaction returns storage to the unassigned pool, and so
increases flexibility for managing storage requests from the two heaps.
Note that because in the preferred embodiment shrinkage is performed (if
possible) after compaction, which in turn will be performed if the garbage
collection does not otherwise satisfy the allocation request, then to some
extent steps 745 and 765 in Figure 7 are effectively amalgamated together.

Although the processing shown in Figure 7 applies at a high level,
there are important differences in detail as regards the management of the
transient and middleware heaps. The policies adopted reference a location
565 which represents the midpoint between the middleware heap boundary 512
and the transient heap boundary 522 (see Figure 5), as determined at JVM
start-up or JVM reset. Thus for the middleware heap, the procedure is to
expand the heap rather than garbage collect, using the expansion criteria
described above, until the heap would expand past the midpoint location
565. If this situation does arise, then the system uses a smaller
expansion increment, namely the minimum expansion value (ie 0.5 Mbyte in
the preferred embodiment). Finally, if even this reduced expansion would
still take the middleware heap past the midpoint, then a garbage collection
is performed (ie step 745), rather than allowing the middleware heap to
expand further. As previously indicated, a compaction will be performed
here if necessary to satisfy the allocation request. After the garbage
collection, the system will then try to expand the middleware heap using
the standard policy based on 30% free space, or the minimum expansion value
of 0.5 Mbyte if the 30% expansion would exceed the midpoint. In other
words, the policy is to try to prevent the middleware heap from expanding
past midpoint 565 (although this may happen eventually if the garbage
collection does not reclaim sufficient space). The rationale behind this is
to try to avoid taking up space from the transient heap, a particular
concern being the possibility of a long-lived middleware object becoming
pinned high up (in the sense of Figure 5) in the heap storage 560, and
therefore seriously limiting the amount of space available to the transient
heap.
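
The middleware heap policy relative to midpoint 565 can be sketched as
below. This is an illustrative model only: addresses are plain integers
increasing upwards as in Figure 5, and the function and parameter names are
assumptions. The value None is used here, purely as a convention of the
sketch, to signal that a garbage collection (step 745) should be tried
before any further expansion.

```python
MIN_EXPANSION = 512 * 1024  # 0.5 MByte minimum expansion value


def middleware_growth(boundary, midpoint, standard_growth,
                      minimum=MIN_EXPANSION):
    """Choose the expansion increment for the middleware heap.

    boundary:        current assigned boundary (address) of the heap
    midpoint:        address of the midpoint location
    standard_growth: increment given by the general 30% policy
    """
    # Normal case: the standard expansion stays at or below the midpoint.
    if boundary + standard_growth <= midpoint:
        return standard_growth
    # Otherwise fall back to the minimum expansion value, if that fits.
    if boundary + minimum <= midpoint:
        return minimum
    # Even the minimum would pass the midpoint: garbage collect instead.
    return None
```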

Considering now the transient heap, then once this reaches (or would
reach) the midpoint 565, then again the expansion rate for this heap is
reduced to half the minimum expansion value. However, unlike for the
middleware heap, this expansion is allowed to continue on past the
midpoint, until eventually all usable heap space is exhausted, when clearly
a garbage collection will be needed. The motivation here is that it is
expected that most new objects for the transaction will be created on the
transient heap, so that this requires most room. Moreover, since the
transient heap will be deleted anyway at the conclusion of the transaction,
the concern about pinned objects is reduced (or the JVM will become dirty,
as discussed in more detail below). A further consequence of this is that
there is a general desire for performance reasons if possible to avoid a
garbage collection during a transaction, but rather to postpone this if
possible until the heap refresh (step 445, see Figure 4) performed as part
of the JVM reset.

With reference to step 765 in Figure 7 (shrinking the other heap to
reclaim space), this step is not performed for an allocation request to the
transient heap (in other words, a No from step 755 would go straight to
Error 780). However, it will be noted that if the allocation request is
about to fail, the heap would already have been garbage collected and
compacted, and the size of the heaps shrunk as per the policy discussed
above, so that the amount of free space available to reclaim anyway is very
limited. However, step 765 in Figure 7 is performed for an allocation
request to the middleware heap, in order to try to reclaim space from the
transient heap. The effect of this, if successful, would generally be to
reduce the transient heap below its original size.

As one minor subtlety on the above, in the preferred embodiment, the
midpoint position is recalculated when the middleware heap is shrunk (but
not when the transient heap size is altered, or when the middleware heap is
enlarged), the new position being halfway between the current middleware
heap boundary and the current transient heap boundary. This attempts to
provide some tuning of the space allocation between the two heaps, although
many other algorithms could be considered as the basis for the control
procedure.

One complication that arises from effectively having multiple heaps
of various sizes is that it becomes more complex to determine whether or
not a given object reference is within a heap (as required, for example,
for step 612 of Figure 6A), and if so which one (in case, for example, they
have different garbage collection policies). One possibility is to compare
the reference with the information in the heap control block 530 (see
Figure 5). However, with multiple heaps, and also a system heap which is
not necessarily contiguous, this becomes a time-consuming operation.

In order to overcome this problem, the preferred embodiment adopts
the approach illustrated schematically in Figure 8. As shown, system
address space or virtual memory 800 is split into chunks of a standard
size, referred to herein as slices 803. As previously mentioned, in the
preferred embodiment on a 32 bit system, these slices are each 64 KBytes in
size. The slices can be numbered linearly as shown with increasing address
space. The heaps can then be allocated out of these slices, in such a way
that heap space is always allocated or deallocated in terms of an integral
number of slices. Figure 8 shows three different heaps (for simplicity
termed A, B and C), whereby heap A is non-contiguous and comprises slices
3-4 and 6-7, heap B comprises slice 9, and heap C is contiguous and
comprises slices 12-14 inclusive. Note that two or more of these heaps may
possibly be being managed as a single block of storage (ie in the same
manner as the transient and middleware heaps of Figure 5).

Also illustrated in Figure 8 is lookup table 825, which has two
columns, the first 830 representing slice number, and the second 831
representing heap number. Thus each row of the table can be used to
determine, for the relevant slice, which heap it is in - a value of zero
(indicated by a dash) is assumed to indicate that the slice is not
currently in a heap. The system updates table 825 whenever slices are
allocated to or deallocated from the heap.

Using table 825 it now becomes very quick to determine whether a
given memory address is in a heap. Thus an initial determination is made of
the relevant slice, by dividing the given memory location (minus the system
base memory location if non-zero) by the slice size, and rounding down to
the next integer (ie truncating) to obtain the slice number. This can then
be used to directly access the corresponding heap identifier in column 831.
In fact, it will be appreciated that column 830 of table 825 does not need
to be stored explicitly, since the memory location of each entry in column
831 is simply a linear function of slice number. More specifically, each
entry in column 831 can typically be represented by 1 byte, and so the
information for slice N can be found at the base location for table 825,
plus N bytes. Overall therefore, this approach provides a rapid mapping
from object location to heap identity (if any), irrespective of the number
of heaps, or the complexity of their configuration.
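
The single level lookup of Figure 8 can be sketched as follows, using the
slice layout given above for heaps A, B and C. This is a minimal
illustration, not the embodiment's code: the numeric heap identifiers
(1, 2, 3), the table length of 16 slices and the function names are
assumptions. A byte array holds column 831 only, since the slice number is
implied by the position of each byte.

```python
SLICE_SIZE = 64 * 1024     # 64 KBytes per slice
NUM_SLICES = 16            # size of this toy address space (assumed)

# One byte per slice; zero means the slice is not in any heap.
table = bytearray(NUM_SLICES)


def assign(first_slice, last_slice, heap_id):
    """Record that slices first_slice..last_slice belong to heap_id."""
    for slice_no in range(first_slice, last_slice + 1):
        table[slice_no] = heap_id


def heap_of(address, base=0):
    """Return the heap identifier for an address, or 0 if not in a heap."""
    slice_no = (address - base) // SLICE_SIZE   # truncating division
    if not 0 <= slice_no < NUM_SLICES:
        return 0
    return table[slice_no]


# Layout of Figure 8 with hypothetical ids: A=1, B=2, C=3.
assign(3, 4, 1)
assign(6, 7, 1)
assign(9, 9, 2)
assign(12, 14, 3)
```

The cost of a lookup is one subtraction, one division and one indexed
read, whatever the number of heaps or their layout.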

One problem however with the technique illustrated in Figure 8 is
that on 64 bit machines, the virtual memory or address space is so great
that table 825 would become prohibitively large. Thus in a preferred
embodiment for such systems, a modified mapping is used, as shown in Figure
9, which has an extra layer in the memory mapping arrangement. In the
diagram, memory 900 represents the system address space or virtual memory,
which as in Figure 8 is divided into slices 902 (the difference from Figure
8 being that on a 64 bit system, address space is much larger, so there are
many more slices). Figure 9 illustrates the location of two heaps,
arbitrarily denoted A and B, with A comprising slices 2-4 inclusive, and B
comprising slices 1026-1028 inclusive and also slices 9723-9726 inclusive.

Also shown in Figure 9 are two lookup tables, 925, 926, each of
which, for the sake of illustration, contains 2048 entries, and maps to a
corresponding range of slices in address space 900. Thus lookup table 925
maps slices 0-2047, whilst lookup table 926 maps slices 8192-10239. These
lookup tables are directly analogous to that of Figure 8, in that they
logically contain two columns, the first 930 identifying a slice number,
and the second 931 the identity of any heap within that slice (or else
zero). Tables 925 and 926 can be regarded as forming the lower level of the
lookup hierarchy.

Figure 9 also depicts a higher layer in the lookup hierarchy, namely
table 940, which again logically contains two columns. The first column 941
logically represents the number of the lookup table 925, 926 in the next
lower layer of the lookup hierarchy, whilst the second column 942 contains
a pointer to the relevant lookup table. Thus the first row of column 942
contains a pointer 951 to table 925, and the fifth row of column 942
contains a pointer 952 to table 926.

It will be noted that to conserve space, lookup tables in the lower
level of the hierarchy only exist where at least some of the corresponding
slices are assigned to a heap. Thus for the particular arrangement of
Figure 9, the lookup tables for slices 2048-4095, 4096-6143, and 6144-8191
have not been created, since none of these slices has been assigned to any
heap. In other words, lookup tables 925, 926, etc for various slice ranges
will be created and deleted according to whether any slices within that
slice range are being utilised for the heap. If this is not the case, and
the lookup table is deleted (or not created in the first place), the
pointer in column 942 of top level lookup table 940 is set to zero.

The operation of the embodiment shown in Figure 9 is analogous to
that of Figure 8, except that there is an extra level of indirection
involved in the hierarchy. Thus to determine whether a particular reference
or address is within a heap, the correct row is determined based on a
knowledge of the size of a slice 902, and also the number of rows in each
lower level lookup table 925, 926. It is expected that for most rows, the
corresponding entry in column 942 will be null or zero, immediately
indicating that that address is not in a heap slice. However, if the lookup
selects a row which has a non-zero entry, this is then followed (using
pointer 951, 952 or equivalent) to the corresponding lookup table. The
desired entry is then found by locating the row using the reference under
investigation (allowing for which particular lookup table is involved), and
examining the entry for that row in column 931. This will indicate directly
whether or not the slice containing the referenced location is in a heap,
and if so, which one.

As an example of this, to investigate memory address 637503384 we
first integer divide by 65536 (the size of a slice in the preferred
embodiment), to give 9727 (truncated), implying we are in slice 9727. Next
we perform an integer division of 9727 by 2048 (the number of entries in
each lower level lookup table), to give 4 (truncated), implying we are in
the 5th row of column 941. It will be appreciated that we could have got
here directly by dividing 637503384 by 134217728 (which equals 2048x65536,
or in other words, the total number of addresses per lower level lookup
table). In any event, from the 5th row of table 940, it is determined that
the corresponding entry in column 942 is non-zero, so that the specified
address may possibly lie in a heap. Accordingly, pointer 952 is followed to
table 926. Here we can determine that the row of interest is number 1535
(equal to 9727 modulo 2048), from which we can see that this particular
slice is not, after all, part of a heap. It follows of course that this is
also true for any address within this slice.
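
The two level lookup and the worked example above can be reproduced with
the following sketch. It is illustrative only and makes several
assumptions: heaps A and B are given the hypothetical identifiers 1 and 2,
and a dictionary keyed by row number stands in for top level table 940, so
that a missing key plays the role of a zero pointer in column 942.

```python
SLICE_SIZE = 65536   # bytes per slice
ENTRIES = 2048       # entries per lower level lookup table

# Stand-in for table 940: row number -> lower level table (a bytearray).
# Rows with no heap slices are simply absent (the "zero pointer" case).
top = {}


def assign_slice(slice_no, heap_id):
    """Mark one slice as belonging to heap_id, creating tables on demand."""
    lower = top.setdefault(slice_no // ENTRIES, bytearray(ENTRIES))
    lower[slice_no % ENTRIES] = heap_id


def heap_of(address):
    """Return the heap identifier for an address, or 0 if not in a heap."""
    slice_no = address // SLICE_SIZE
    lower = top.get(slice_no // ENTRIES)
    if lower is None:                     # no lower level table exists
        return 0
    return lower[slice_no % ENTRIES]


# Layout of Figure 9 with hypothetical ids: A=1, B=2.
for s in range(2, 5):
    assign_slice(s, 1)
for s in list(range(1026, 1029)) + list(range(9723, 9727)):
    assign_slice(s, 2)
```

With this layout only rows 0 and 4 of the top level exist, matching tables
925 and 926, and address 637503384 resolves to slice 9727, row 1535 of the
second table, which holds zero.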

Note that as for Figure 8, the slice number columns 930 of lookup
tables 925, 926 are not in practice needed, since the desired row in column
931 can be determined directly by using the slice number (modulo 2048) as
an offset from the base address of the lookup table. Likewise, column 941
of table 940 is also redundant, since the relevant row can be determined
directly from the address. In fact however, the vast majority of rows in
table 940 (column 942) are likely to be zero, in which case storing the
information in some other data structure such as a linked list would be
much more efficient in terms of space (but may reduce lookup speed).

It will be appreciated that any suitable data structure can be used
for storing the two levels of lookup information, shown as tables 940, and
925, 926 respectively. It will also be recognised that the sizes discussed
with reference to Figures 8 and 9 (a slice size of 65536 bytes; 2048 slices
per lower level lookup table) are exemplary only, and can be varied as
circumstances dictate to optimize performance.

Returning now to Figure 4, as previously described, at the end of a
transaction the transient heap is deleted (equivalent to the refresh heap
step 445, performed as part of the reset JVM). This activity is generally
similar to garbage collection, although certain optimizations are possible,
and certain additional constraints need to be considered. This process is
shown in more detail in the flow chart of Figure 10 (which is split for
convenience into two components, 10A and 10B). The first step in Figure 10A
(1005) is to wait for all finalization activity to complete. Thus if there
has been a GC during a transaction then there may be finalizers to be run
and they must be run before the transient heap can be reset, as the
finalizers could create (or require) other objects. This checking is
performed by confirming that the reference handler and finalizer thread
have emptied their respective queues, and that there are no other
in-progress objects (ie the processing of all pending finalization objects
has been completed). Next all the locks required for garbage collection are
obtained, and all other threads are suspended (step 1010). The system is
now in a position to commence deletion of the transient heap.

In order to accomplish this, the stacks and registers of all threads
are scanned (as for a normal garbage collection), and if a reference is
found to the transient heap (step 1015) then the JVM is potentially dirty
and so cannot be reset. The reason for this as discussed in relation to
standard garbage collection (Figure 6) is that the references on the stacks
and registers must be treated as live, even though it is not certain that
they are in fact object references. To firm up on this the references are
tested to see if it is possible to exclude them from being object
references (step 1020), essentially by using the same three tests 612, 615
and 620 of Figure 6. In other words, if the possible reference is not on
the heap, or does not fall on an 8-byte boundary, or does not correspond to
an allocated memory location, then it cannot in fact be a reference.
Otherwise, the register or stack value may still be a reference, and so
processing has to exit with an error that the JVM is dirty and cannot be
reset (step 1099). Note that references from the stacks or registers to the
middleware or system heap are of course acceptable, because objects on
these heaps are not being deleted.
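
The three exclusion tests applied to each stack or register value can be
sketched as below. This is a simplified model, not the embodiment's code:
the transient heap is taken to be a single address range, and `allocated`
is a hypothetical set of addresses at which objects are currently
allocated, standing in for whatever allocation record the heap actually
keeps.

```python
def may_be_transient_reference(value, heap_start, heap_end, allocated):
    """Return True if a stack/register value cannot be ruled out as a
    reference into the transient heap."""
    if not heap_start <= value < heap_end:
        return False          # not within the transient heap
    if value % 8 != 0:
        return False          # not on an 8-byte boundary
    if value not in allocated:
        return False          # not an allocated object location
    return True               # may still be a genuine reference


def jvm_is_dirty(roots, heap_start, heap_end, allocated):
    """The JVM cannot be reset if any root value survives all three tests."""
    return any(may_be_transient_reference(v, heap_start, heap_end, allocated)
               for v in roots)
```

Because the tests can only exclude values, an integer that happens to
equal the address of an allocated transient object is treated as a
reference, which is why a spurious value can occasionally block a reset.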

It will be appreciated that based on the above, a spurious data value
in a stack or register will sometimes prevent JVM reset. However this
happens relatively infrequently in practice, because all but the main
application thread and certain system threads should have terminated at
this point, so the stacks are relatively empty (nb the policy adopted in
the preferred embodiment is that a JVM cannot be reset if more than a
single transaction thread was used; multiple middleware threads are
tolerated providing they have terminated by the completion of the
middleware tidyups). Related to this, as previously mentioned finalizer
objects on the transient heap are retained in that heap until a JVM reset.
This means that references to such objects are not entered onto the stack
for the finalizer thread, which would otherwise typically cause the reset
to fail at steps 1015 and 1020 (this would be the case even where the
finalize method for the object had been finished, since this would not
necessarily lead to complete deletion of the corresponding stack entry;
rather the finalizer thread may enter a function to wait for more work,
resulting in uninitialized areas on the stack which may point to previously
processed finalizer objects).



It is important to note that error 1099 indicating that the JVM is
dirty does not imply that previous processing was incorrect, merely that
the JVM cannot be reset (although of course this may in turn indicate some
unexpected action by the application). In other words, a new JVM will need
to be created for the next application. Because of this, if it is detected
that the JVM is dirty, such as a negative outcome at step 1020, the method
normally proceeds immediately to step 1099. This returns an error code to
the reset JVM request from the middleware, with no attempt to continue to
perform any further garbage collection. The reason for this is that the
middleware may want to do a little more tidying up, but generally it is
expected that it will terminate the current JVM fairly quickly. Hence there
is unlikely to be a need for any further garbage collection, which rather
would represent an unnecessary waste of time. A similar policy is adopted
whenever the processing of Figure 10A indicates that the JVM is dirty.

Assuming now a negative result from step 1015 or 1020, the JVM
refresh continues with an examination of the primordial statics fields
(step 1025) to see what objects they reference. Since these fields will be
retained through the JVM reset, it is important that the objects that they
reference, either directly or indirectly, are likewise retained. If however
the referenced objects are application objects (tested at step 1030) then
clearly these cannot be retained, because the application has essentially
terminated, and the purpose of resetting the JVM is to allow a new
application to commence. Therefore, if the primordial statics do reference
an application object, then the JVM is marked as dirty, and the method
proceeds to error 1099.

Assuming that the objects referenced by the primordial static fields
are not application objects (typically they will be primordial object
instances or arrays), then these are moved ("promoted") from the transient
heap to the middleware heap (step 1035). The reason why such objects are
placed on the transient heap initially is that at allocation time, it may
not be known that the object to be allocated is a primordial static
variable, or reachable from one.

(Note that this approach bears some similarities to generational 
garbage collection, in which new objects are initially allocated to a 
short-term heap, and then promoted to a longer-term heap if they survive 
beyond a certain time, but the criterion for promotion is different: 
essentially it is based on object type or usage, rather than age. 
Generational garbage collection is discussed further in the book by Jones 
and Lins referenced above).



One complication (not shown in Figure 10) is that promoting an object
from the transient heap to the middleware heap may lead to an allocation
failure on the middleware heap if space is exhausted. In such an
eventuality, a garbage collection is performed. If this still does not
create enough space, then this will lead to error 1099.

After the primordial static objects have been promoted, the next step
is to review the card table (536 - see Figure 5). The card table represents
a set of bytes, one per fixed unit of heap (for example 512 bytes).
Whenever an object reference is written to the heap, the card table is
updated to indicate dirty (nb marking a card as dirty does not imply that
the JVM itself is necessarily dirty). The card updated corresponds not to
the portion of the heap which contains the updated object reference itself,
but rather to the portion of heap which contains the top of the object that
includes the reference (for a small object these may of course be the
same). Given that updating object references is a frequent operation, the
card table must operate very quickly. This is the reason why each card is a
byte despite containing only a single bit of information, because in
practice this can be manipulated more quickly. Furthermore, no attempt at
this write stage is made to investigate the nature of the reference update,
for example whether the reference was set to a null value, or to an object
in a particular heap.
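
The card marking just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of the preferred embodiment: the class CardTableSketch and its method names are hypothetical, heap positions are modelled as integer offsets from the start of the heap, and the 512-byte card size is simply the example figure given above.

```java
import java.util.Arrays;

// Hypothetical byte-per-card table; names and sizes are illustrative only.
final class CardTableSketch {
    static final int CARD_SIZE = 512; // bytes of heap covered by one card
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    private final byte[] cards;

    CardTableSketch(int heapSizeBytes) {
        // One card per CARD_SIZE bytes, rounding up for a partial last unit.
        cards = new byte[(heapSizeBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Index of the card covering the given heap offset. */
    static int cardIndex(int heapOffset) {
        return heapOffset / CARD_SIZE;
    }

    /**
     * Called whenever an object reference is written to the heap. Only the
     * offset of the top of the containing object is used; the offset of the
     * updated field and the value written are deliberately not examined.
     */
    void onReferenceWrite(int objectStartOffset) {
        cards[cardIndex(objectStartOffset)] = DIRTY; // single byte store
    }

    boolean isDirty(int card) {
        return cards[card] == DIRTY;
    }

    int cardCount() {
        return cards.length;
    }

    /** Clears every card, as would be done when the first heap is reset. */
    void clearAll() {
        Arrays.fill(cards, CLEAN);
    }
}
```

For instance, an object whose top is at offset 1000 marks card 1 on every reference update, even when the updated field itself lies in a later card.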






Now during JVM reset the card table is scanned, or more particularly
those cards which correspond to the region currently assigned to the
middleware heap are scanned. Thus cards for the transient heap 520 and for
the unassigned region 510 are not scanned, even if they have previously
been part of the middleware heap. As part of this review, it is first
determined whether any cards are set (ie marked as dirty) (step 1045). This
indicates that a reference in the corresponding portion of the middleware
heap has been updated since the last JVM reset, and so must be checked to
confirm that it does not point to the transient heap. The first part of
this check is to find all object references in objects which start in the
heap portion corresponding to the marked card. Note that there may be more
than one object to review here, or possibly none at all if the object
previously located there has since been garbage collected and the space
reused by a larger object whose beginning is situated outside that portion
of the heap. For all objects associated with a marked card, all references
contained in those objects (even if the references themselves are outside
the portion of the heap corresponding to the card) are checked to see if
they point to the transient heap (step 1050). If they do not, for example
they contain only null pointers, and/or references to the middleware heap,
then this is not a problem for JVM reset. On the other hand, if there are
any such pointers to the transient heap from the middleware heap, this will
be a problem on reset since those references will no longer be valid once
the transient heap is cleared. The one exception to this is where the
objects containing these problematic references are no longer live (ie
could be garbage collected).
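
The review of steps 1045 and 1050 can be sketched as below. The sketch is illustrative only and rests on several assumptions: ScannedObject and CardScanSketch are hypothetical names, references are modelled as integer heap offsets with -1 standing for null, the middleware heap is taken to start at offset 0, and the dirty cards are passed as a boolean array for readability. For brevity the sketch walks a list of objects and tests the card of each, whereas the text walks the marked cards and then locates the objects starting in each; the objects selected are the same.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of one object on the middleware heap.
final class ScannedObject {
    final int start;  // heap offset of the top of the object
    final int[] refs; // offsets of referenced objects, -1 for null

    ScannedObject(int start, int... refs) {
        this.start = start;
        this.refs = refs;
    }
}

final class CardScanSketch {
    static final int CARD_SIZE = 512; // example card size from the text

    /**
     * Returns the objects which start in a marked card of the region
     * currently assigned to the middleware heap, [0, middlewareEnd), and
     * which contain at least one reference into the transient heap,
     * [transientStart, transientEnd).
     */
    static List<ScannedObject> findSuspects(boolean[] dirtyCards,
                                            List<ScannedObject> objects,
                                            int middlewareEnd,
                                            int transientStart,
                                            int transientEnd) {
        List<ScannedObject> suspects = new ArrayList<>();
        for (ScannedObject obj : objects) {
            if (obj.start >= middlewareEnd) {
                continue; // outside the current middleware heap: not scanned
            }
            if (!dirtyCards[obj.start / CARD_SIZE]) {
                continue; // card not marked since the last reset (step 1045)
            }
            // Every reference in the object is checked, wherever it lies.
            for (int target : obj.refs) {
                if (target >= transientStart && target < transientEnd) {
                    suspects.add(obj); // step 1050: points to transient heap
                    break;
                }
            }
        }
        return suspects;
    }
}
```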

Therefore, on a positive outcome to step 1050, the system performs
the mark phase of a garbage collection (step 1055), which is a relatively
long operation. If the problematic references are in objects which are
marked (ie live), as tested at step 1060, then they are indeed problematic,
so the JVM must be regarded as dirty: hence the method proceeds to error
1099. On the other hand, if the problematic references are in objects which
are not marked, then they can effectively be ignored, since these objects 
are no longer live. 
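
The liveness test of steps 1055 and 1060 can be sketched as follows. This is a simplified illustration, not the garbage collector of the preferred embodiment: Node and MarkCheckSketch are hypothetical names, the roots are supplied as a plain list, and marking is recorded in a separate set rather than in object headers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical heap object holding only its outgoing references.
final class Node {
    final List<Node> refs = new ArrayList<>();
}

final class MarkCheckSketch {
    /** Mark phase: returns the set of objects reachable from the roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked =
                Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> work = new ArrayDeque<>();
        for (Node root : roots) {
            if (marked.add(root)) {
                work.push(root);
            }
        }
        while (!work.isEmpty()) {
            Node current = work.pop();
            for (Node ref : current.refs) {
                if (marked.add(ref)) {
                    work.push(ref);
                }
            }
        }
        return marked;
    }

    /**
     * The JVM is dirty only if some object holding a problematic reference
     * is marked (ie live). Unmarked suspects are ignored.
     */
    static boolean isDirty(List<Node> roots, List<Node> suspects) {
        Set<Node> marked = mark(roots);
        for (Node suspect : suspects) {
            if (marked.contains(suspect)) {
                return true; // step 1060 positive: error 1099
            }
        }
        return false;
    }
}
```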

Note that if the heaps have been compacted during a transaction, then 
this invalidates the card table. In such cases a full scan of the 
middleware heap is required to locate any object references to the transient
heap, equivalent to the garbage collection mark phase of step 1055 if any 
such references are found. 

Assuming that the test of step 1060 produces a negative output (ie no
live middleware references to the transient heap), the method proceeds to
scan JNI global references. These are references which are used by native
code routines (ie running directly on OS 30 rather than on JVM 40, see
Figure 1) to refer to Java objects. Using the Java Native Interface (JNI)
such references can be made global, that is available to all threads, in
which case they will exist independently of the thread that created them.
All such JNI global reference slots are scanned (step 1065) (see Figure
10B) and if a reference to the transient heap is found (step 1070) the JVM
is marked as dirty (ie error 1099), since these references will clearly
fail once the transient heap is reset.

Providing this is not the case, the JNI weak references are scanned
next (step 1072). These are references which the application specifies
using JNI as expendable, in that they can be deleted if no longer used.
Accordingly, any such weak JNI references to the transient heap that are
found can be nulled (step 1074), thereby permitting the JVM reset to
proceed.
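
The different treatment of global and weak JNI references in steps 1065 to 1074 can be sketched as below. The sketch does not call any real JNI function. It assumes, purely for illustration, that the reference slots are available as arrays of integer heap offsets, with -1 standing for null, and that the transient heap occupies the range [transientStart, transientEnd); JniRefScanSketch and its methods are hypothetical names.

```java
// Hypothetical model of the JNI reference slot scans; not real JNI calls.
final class JniRefScanSketch {
    static final int NULL_REF = -1;

    static boolean inTransient(int target, int transientStart, int transientEnd) {
        return target >= transientStart && target < transientEnd;
    }

    /**
     * Steps 1065 and 1070: returns false (JVM dirty, error 1099) if any
     * global reference slot points into the transient heap.
     */
    static boolean globalsAreClean(int[] globalSlots,
                                   int transientStart, int transientEnd) {
        for (int target : globalSlots) {
            if (inTransient(target, transientStart, transientEnd)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Steps 1072 and 1074: nulls every weak reference slot that points
     * into the transient heap and returns how many slots were nulled.
     */
    static int clearWeakRefs(int[] weakSlots,
                             int transientStart, int transientEnd) {
        int cleared = 0;
        for (int i = 0; i < weakSlots.length; i++) {
            if (inTransient(weakSlots[i], transientStart, transientEnd)) {
                weakSlots[i] = NULL_REF;
                cleared++;
            }
        }
        return cleared;
    }
}
```

A global reference into the transient heap therefore blocks the reset, whereas a weak reference is simply discarded.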

Next, the static variables of all middleware classes are scanned
(step 1076) to see if any directly reference the transient heap (step
1078). Note that these won't previously have been examined, since they are
on the system heap rather than the middleware heap. If a direct reference
to the transient heap is found, the JVM is dirty, corresponding to error
1099. (Note that unlike for the primordial statics (step 1025) there is no
need to iteratively follow references from the middleware statics, since
any indirect references will already have been picked up by preceding
analysis). If no transient heap references are found, the processing
continues to step 1080 in which objects on the transient heap are reviewed
to see if any have finalizer methods, and any that are found are now run
(step 1082). One important aspect of the preferred embodiment is that these
finalizer methods are run on the main thread, rather than being passed to
the system finalizer. An implication of this is that the finalizer methods
will be run in the known and controllable context of the main thread. In
addition, it is ensured that the finalizer methods complete before
progressing to the next stage of the JVM reset. Unfortunately, finalizer
methods can create fresh objects, which may newly reference the transient
heap. Therefore, after the finalizer methods have completed, processing
must return to step 1025 to repeat much of the checking, to ensure that the
system is still in a position for JVM reset. In theory, if the finalizer
methods have created new objects on the transient heap which themselves
have finalizer methods, then this loop may have to be followed more than
once.
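
The loop formed by steps 1025 to 1082 can be sketched as follows. This is an illustrative model under stated assumptions: FinalizerLoopSketch is a hypothetical name, each outstanding finalizer method is represented by a Runnable in a list, and the checks of steps 1025 to 1078 are collapsed into a single caller-supplied test. Everything runs synchronously on the calling thread, which stands in for the main thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

final class FinalizerLoopSketch {
    /**
     * pending holds the finalizers of objects currently on the transient
     * heap; a finalizer may append further entries while it runs.
     * checksPass stands in for the checking of steps 1025 to 1078.
     * Returns true if the JVM can proceed to reset (step 1085), or false
     * if the checking finds the JVM dirty (error 1099).
     */
    static boolean runUntilQuiet(List<Runnable> pending,
                                 BooleanSupplier checksPass) {
        while (true) {
            if (!checksPass.getAsBoolean()) {
                return false; // JVM is dirty
            }
            if (pending.isEmpty()) {
                return true; // step 1080 negative: no finalizers remain
            }
            // Copy the batch so finalizers can safely register new work.
            List<Runnable> batch = new ArrayList<>(pending);
            pending.clear();
            for (Runnable finalizer : batch) {
                finalizer.run(); // step 1082: completes before continuing
            }
            // Loop back to repeat the checking before testing again.
        }
    }
}
```

Because each finalizer returns before the next stage begins, the repeated checking always sees the complete effect of the finalizers that have just run.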

Note that strictly speaking there is no formal requirement to run the 
finalizers at this stage, since this is the point at which the JVM would 
normally terminate at the conclusion of an application, rather than having 
a garbage collection performed. Nevertheless, the policy in the preferred
embodiment is that object finalizers will be run before deletion at JVM
reset, although other implementations may have different policies. 

It is assumed that eventually all finalizers will be run, resulting 
in a negative outcome to the test of step 1080. In these circumstances, the 
method proceeds to step 1085, which represents reset of the JVM by deleting
the transient heap. In practice, this involves several operations. Firstly,
if the mark phase of the garbage collection was run (step 1055) then the
sweep phase, which is relatively quick, is now run on the middleware heap.
Next, various operations are performed to formally reset the transient
heap, including: the removal of all transient heap monitors and the freeing
of storage for transient heap class blocks (ie releasing the storage
utilised by the class block, which is not on the heap). The transient heap
pointers can now be reset so that the heap is effectively emptied, and
restored to its initial size (by setting boundary 522 appropriately).



In the preferred embodiment it is declared that the transient heap
will be set to the same initial size for each transaction. One potential
problem with honouring this is that the middleware heap may have expanded
during the previous application, and then retain this space through a reset
of the JVM. Since there is no constraint on the transient heap shrinking
below its initial size, to surrender space to the middleware heap if
required, this can in turn make it impossible for the transient heap in the
next incarnation of the JVM to be set to the same initial size as the
current transient heap. If this problem arises, a specific attempt is made
to shrink the middleware heap sufficiently to accommodate the correct
initial size of the transient heap. However, if this attempt is
unsuccessful, the JVM must be marked as dirty, and cannot be reset to its
initial state.

Once the transient heap has been recreated (although it could be done
before), a garbage collection is performed on the middleware heap if
either of the following two cases is true: firstly, if the number of
slices left in the unallocated portion of the heap, between the middleware
heap and the transient heap, is less than two, or secondly if the amount of
free space in the middleware heap plus half the unassigned portion 515 of
the heap (see Figure 5) is less than the amount of storage used by the
previous transaction times three. Both of these can be regarded as a
preemptive garbage collection, performing this operation now if the next
transaction is otherwise likely to be constrained for space, in the hope
that this will avoid a garbage collection during the transaction itself.
Note that in the current implementation this preemptive garbage collection
would be performed irrespective of whether a garbage collection mark phase
was performed in step 1055. Finally, all the threads can be restarted and
the garbage collection locks released, whereupon the reset is completed,
and the JVM is available to support the next application.
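
The two conditions for the preemptive garbage collection can be written out as a short Java sketch. The names are hypothetical, and the second condition is read here as (free space + half the unassigned portion) < 3 x (storage used by the previous transaction), which is one reasonable interpretation of the wording above rather than a confirmed formula.

```java
// Hypothetical helper expressing the preemptive collection test.
final class PreemptiveGcSketch {
    /**
     * Returns true if a garbage collection of the middleware heap should
     * be performed now, before the next transaction starts.
     *
     * freeSlices               slices left between the two heaps
     * middlewareFreeBytes      free space in the middleware heap
     * unassignedBytes          size of the unassigned portion of the heap
     * previousTransactionBytes storage used by the previous transaction
     */
    static boolean shouldCollect(int freeSlices,
                                 long middlewareFreeBytes,
                                 long unassignedBytes,
                                 long previousTransactionBytes) {
        boolean fewSlices = freeSlices < 2;
        boolean lowSpace = middlewareFreeBytes + unassignedBytes / 2
                < 3 * previousTransactionBytes;
        return fewSlices || lowSpace;
    }
}
```

For example, with five slices remaining, 100 bytes free, 200 bytes unassigned and 100 bytes used by the previous transaction, the second condition holds (200 is less than 300) and a collection is performed.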

The skilled person will be aware of many possible variations on the
embodiment described above. The invention has been described primarily in
relation to Java in a server environment, but it will be understood that it
applies to any other language with similar properties (possibly C# from
Microsoft Corporation), and is also potentially applicable to the client
environment, such as when it is necessary to have a quick start-up of
applications. In addition, many of the details of the systems and processes
utilised are exemplary only, and can be varied according to particular
circumstances. Thus other modifications to the embodiments described herein
will be apparent to the skilled person yet remain within the scope of the
invention as set out in the attached claims.




CLAIMS 



1. A computer system providing an object-based virtual machine 
environment for running successive applications, said computer system 
including storage, at least a portion of which is logically divided into 
two or more heaps in which objects can be stored, wherein a first heap is
reset between successive applications, and a second heap persists from one
application to the next, said system including:

a card table comprising multiple cards, each corresponding to a
region of said storage, each card in the card table being set to null when
the first heap is reset between successive applications;

means for marking a card whenever an object in its corresponding
storage region is created or updated; and

means for detecting possible references from the second heap to the
first heap at reset by scanning the cards in the card table corresponding
to the second heap, and detecting any cards which have been marked.

2. The computer system of claim 1, further comprising: 

means for locating, for each marked card, any objects in the 
corresponding region of storage; and 

means for identifying any references to the first heap in the located
objects.

3. The computer system of claim 2, further comprising: 

means responsive to the identification of references to the first 
heap for performing the mark phase of a garbage collection to determine 
live objects in at least the second heap; 

means for detecting whether any objects in the second heap having 
references to the first heap have been marked as live; and 

means responsive to a detection of any such objects for returning an
error condition to prevent reset for another application. 

4. The computer system of claim 3, further comprising means for 
invalidating the card table if a compact operation has been performed on 
the second heap since the last reset, wherein said means for performing the 
mark phase is also responsive to invalidation of the card table. 

5. The computer system of claim 1, wherein an object is only considered
as within the region of storage corresponding to a card if a predetermined 
part of the object is in that region. 

6. The computer system of claim 1, wherein the region of memory
corresponding to a card comprises between 256 and 2048 bytes.

7. The computer system of claim 1, further comprising:

means for detecting references or possible references to the first
heap from a set of predetermined locations; and

means responsive to the detection of any such references or possible 
references for returning an error condition to prevent reset for another 
application. 

8. The computer system of claim 7, wherein said set of predetermined
locations includes the stacks and registers. 

9. The computer system of claim 1, further comprising: 

means for detecting any objects on the first heap which are reachable 
from virtual machine system class objects; and

means for promoting any such detected objects to the second heap.

10. A computer system providing an object-based virtual machine 
environment for running successive applications, said computer system
including storage, at least a portion of which is logically divided into 
two or more heaps in which objects can be stored, wherein a first heap is 
reset between successive applications, and a second heap persists from one 
application to the next, said system including: 

means for identifying any objects on the first heap which have a
finalization method; and

means for running the finalization methods of any identified objects
on the main thread prior to reset of the first heap. 

11. The computer system of claim 10, further comprising means responsive
to running said finalization methods for checking that they have not 
performed any operations which would prevent reset of the first heap. 

12. A method of operating a computer system providing an object-based
virtual machine environment for running successive applications, said
computer system including storage, at least a portion of which is logically
divided into two or more heaps in which objects can be stored, wherein a
first heap is reset between successive applications, and a second heap
persists from one application to the next, said method including the steps
of:

providing a card table comprising multiple cards, each corresponding
to a region of said storage, each card in the card table being set to null
when the first heap is reset between successive applications;

marking a card whenever an object in its corresponding storage region
is created or updated; and

detecting possible references from the second heap to the first heap
at reset by scanning the cards in the card table corresponding to the
second heap, and detecting any cards which have been marked.

13. The method of claim 12, further comprising:

locating, for each marked card, any objects in the corresponding 
region of storage; and 

identifying any references to the first heap in the located objects. 

14. The method of claim 13, further comprising:

responsive to the identification of references to the first heap,
performing the mark phase of a garbage collection to determine live objects
in at least the second heap;

detecting whether any objects in the second heap having references to
the first heap have been marked as live; and

responsive to a detection of any such objects, returning an error 
condition to prevent reset for another application. 

15. The method of claim 14, further comprising the step of invalidating
the card table if a compact operation has been performed on the second heap
since the last reset, wherein said step of performing the mark phase is
also performed in response to invalidation of the card table.

16. The method of claim 12, wherein an object is only considered as 
within the region of storage corresponding to a card if a predetermined 
part of the object is in that region. 

17. The method of claim 12, wherein the region of memory corresponding to
a card comprises between 256 and 2048 bytes. 

18. The method of claim 12, further comprising:

detecting references or possible references to the first heap from a
set of predetermined locations; and

responsive to the detection of any such references or possible
references, returning an error condition to prevent reset for another
application. 

19. The method of claim 18, wherein said set of predetermined locations
includes the stacks and registers. 

20. The method of claim 12, further comprising:

detecting any objects on the first heap which are reachable from
virtual machine system class objects; and

promoting any such detected objects to the second heap. 

21. A method of operating a computer system providing an object-based
virtual machine environment for running successive applications, said
computer system including storage, at least a portion of which is logically
divided into two or more heaps in which objects can be stored, wherein a
first heap is reset between successive applications, and a second heap
persists from one application to the next, said method including the steps
of:

identifying any objects on the first heap which have a finalization
method; and

running the finalization methods of any identified objects on the main
thread prior to reset of the first heap.

22. The method of claim 21, further comprising the step, responsive to 
running said finalisation methods, of checking that they have not performed 
any operations which would prevent reset of the first heap. 

23. A computer program for implementing the method of any of claims 12 to
22.

COMPUTER SYSTEM WITH HEAP RESET

ABSTRACT

A computer system provides an object-based virtual machine
environment for running successive applications. The computer system
includes storage, at least a portion of which is logically divided into two
or more heaps in which objects can be stored. A first heap is reset between
successive applications, and a second heap persists from one application to
the next. A card table is provided which comprises multiple cards, each
corresponding to a region of said storage. Each card in the card table is
set to null when the first heap is reset between successive applications. A
card is marked whenever an object in its corresponding storage region is
created or updated. It is then possible to detect potential references from
the second heap to the first heap at reset by scanning the cards in the
card table corresponding to the second heap, and detecting any cards which
have been marked.

The system further identifies any objects on the first heap which
have a finalization method. The finalization methods of any such
identified objects are then run on the main thread prior to reset of the
first heap.